source: branches/branch-1-0/src/helpers/xmldefs.c@ 368

Last change on this file since 368 was 97, checked in by umoeller, 24 years ago

XML updates.

1
2/*
3 *@@sourcefile xmldefs.c:
4 * this file is just for xdoc and contains glossary items for
5 * XML. It is never compiled.
6 *
7 *@@added V0.9.6 (2000-10-29) [umoeller]
8 */
9
10/*
11 * Copyright (C) 2001 Ulrich Möller.
12 * This file is part of the "XWorkplace helpers" source package.
13 * This is free software; you can redistribute it and/or modify
14 * it under the terms of the GNU General Public License as published
15 * by the Free Software Foundation, in version 2 as it comes in the
16 * "COPYING" file of the XWorkplace main distribution.
17 * This program is distributed in the hope that it will be useful,
18 * but WITHOUT ANY WARRANTY; without even the implied warranty of
19 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
20 * GNU General Public License for more details.
21 */
22
23/*
24 *@@gloss: expat expat
25 * Expat is one of the most well-known XML processors (parsers).
26 * I (umoeller) have ported expat to the XWorkplace Helpers
27 * library. See xmlparse.c for an introduction to expat. See
28 * xml.c for an introduction to XML support in the XWorkplace
29 * Helpers in general.
30 */
31
32
33/*
34 *@@gloss: XML XML
35 * XML is the Extensible Markup Language, as defined by
36 * the W3C. XML isn't really a language, but a meta-language
37 * for describing markup languages. It is a simplified subset
38 * of SGML.
39 *
40 * You should be familiar with the following:
41 *
42 * -- XML parsers operate on XML @documents.
43 *
44 * -- Each XML document has both a physical and a logical
45 * structure.
46 *
47 * Physically, the document is composed of units called
48 * @entities.
49 *
50 * Logically, the document is composed of @markup and
51 * @content. Among other things, markup separates the content
52 * into @elements.
53 *
54 * -- The logical and physical structures must nest properly (be
55 * @well-formed) for each entity, which results in the entire
56 * XML document being well-formed as well.
57 */
58
59/*
60 *@@gloss: entities entities
61 * An "entity" is an XML storage unit. It's a very abstract
62 * concept, and the term doesn't make much sense, but it was
63 * in SGML already, and XML chose to inherit it.
64 *
65 * In the simplest case, an XML document has only one entity,
66 * which is an XML file (or memory buffer from wherever).
67 * The document entity serves as the root of the entity tree
68 * and a starting-point for an XML processor. Unlike other
69 * entities, the document entity has no name and might well
70 * appear on a processor input stream without any identification
71 * at all.
72 *
73 * Entities are defined to be either parsed or unparsed.
74 *
75 * Other than that, there are @internal_entities,
76 * @external_entities, and @parameter_entities.
77 *
78 * See @entity_references for how to reference entities.
79 */
80
81/*
82 *@@gloss: entity_references entity references
83 * An "entity reference" refers to the content of a named
84 * entity (see: @entities). It is enclosed between "&amp;" and ";"
85 * characters.
86 *
87 * If you declare @internal_entities in the @DTD, referencing
88 * them allows for text replacements as in SGML:
89 *
90 + This document was prepared on &PrepDate;.
91 *
92 * The same works for @external_entities. Assuming
93 * that "SecondFile" has been declared in the DTD to point
94 * to another file,
95 *
96 + See the following README: &SecondFile;
97 *
98 * would then insert the complete contents of the second
99 * file into the document. The XML processor will parse
100 * that file as if it were at that position in the original
101 * document.
102 *
103 * An entity is "included" when its replacement text
104 * is retrieved and processed, in place of the reference itself,
105 * as though it were part of the document at the location the
106 * reference was recognized.
107 * The replacement text may contain
108 * both @content and (except for @parameter_entities)
109 * @markup, which must be recognized in the usual way, except
110 * that the replacement text of entities used to escape markup
111 * delimiters (the entities amp, lt, gt, apos, quot) is always
112 * treated as data. (The string "AT&amp;amp;T;" expands to "AT&amp;T;"
113 * and the remaining ampersand is not recognized as an
114 * entity-reference delimiter.) A @character_reference is
115 * included when the indicated character is processed in
116 * place of the reference itself.
117 *
118 * The following are forbidden, and constitute fatal errors:
119 *
120 * -- the appearance of a reference to an unparsed entity;
121 *
122 * -- the appearance of any character or general-entity reference
123 * in the @DTD except within an EntityValue or AttValue;
124 *
125 * -- a reference to an external entity in an attribute value.
126 */
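The rule above, that the replacement text of amp, lt, gt, apos and quot is treated as data and not scanned again, can be sketched in C. The name `expand_predefined` is hypothetical; it is not part of expat or the XWorkplace helpers. This sketch only knows the five predefined entities and copies every other reference unchanged.

```c
#include <stddef.h>
#include <string.h>

/* The five predefined entities and the character each one stands for. */
static const struct {
    const char *ref;
    char        ch;
} predefined[] = {
    { "&amp;",  '&'  },
    { "&lt;",   '<'  },
    { "&gt;",   '>'  },
    { "&apos;", '\'' },
    { "&quot;", '"'  }
};

/*
 * expand_predefined (hypothetical helper):
 *      expands predefined entity references in place, in a single
 *      pass. The output is never rescanned, so an ampersand produced
 *      by "&amp;" cannot start a new reference. Returns the new length.
 */
size_t expand_predefined(char *s)
{
    const size_t n = sizeof(predefined) / sizeof(predefined[0]);
    char *in = s, *out = s;

    while (*in) {
        size_t i;

        for (i = 0; i < n; i++) {
            size_t len = strlen(predefined[i].ref);

            if (strncmp(in, predefined[i].ref, len) == 0) {
                *out++ = predefined[i].ch;
                in += len;
                break;
            }
        }
        if (i == n)             /* not a predefined reference: copy as-is */
            *out++ = *in++;
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Working in place is safe here because every replacement is shorter than the reference it replaces.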
127
128/*
129 *@@gloss: internal_entities internal entities
130 * An "internal entity" has no separate physical storage.
131 * Its contents appear in the document's @DTD as an
132 * @entity_declaration, like this:
133 *
134 + <!ENTITY PrepDate "Feb 11, 2001">
135 *
136 * This can later be referenced with @entity_references
137 * and allows you to define shortcuts for frequently typed
138 * text or text that is expected to change, such as the
139 * revision status of a document.
140 *
141 * XML has five built-in internal entities:
142 *
143 * -- "&amp;amp;" refers to the ampersand ("&amp;") character,
144 * which normally introduces @markup and can therefore
145 * only be literally used in @comments, @processing_instructions,
146 * or @CDATA sections. This is also legal within the literal
147 * entity value of declarations of internal entities.
148 *
149 * -- "&amp;lt;" and "&amp;gt;" refer to the angle brackets
150 * ("&lt;", "&gt;") which normally introduce @elements.
151 * They must be escaped unless used in a @CDATA section.
152 *
153 * -- To allow values in @attributes to contain both single and double
154 * quotes, the apostrophe or single-quote character (') may be
155 * represented as "&amp;apos;", and the double-quote character
156 * (") as "&amp;quot;".
157 *
158 * A numeric @character_reference is a special case of an entity reference.
159 *
160 * An internal entity is always parsed.
161 *
162 * Also see @entities.
163 */
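As a rough illustration of the five built-in entities, the following C sketch escapes a string so that it can be written out as content or as an attribute value. The function name `xml_escape` and its snprintf-like return convention are assumptions made for this example only.

```c
#include <stddef.h>
#include <string.h>

/*
 * xml_escape (hypothetical helper):
 *      copies src to dst, replacing & < > ' " with references to
 *      the predefined entities. Returns the length the complete
 *      result needs (without the null terminator), so a return value
 *      >= dstsize means the output was truncated. dst is always
 *      null-terminated if dstsize > 0.
 */
size_t xml_escape(const char *src, char *dst, size_t dstsize)
{
    size_t needed = 0;      /* length of the complete escaped text */
    size_t out = 0;         /* bytes actually stored in dst */
    int truncated = 0;

    for (; *src; src++) {
        char single[2];
        const char *rep;
        size_t len;

        single[0] = *src;
        single[1] = '\0';

        switch (*src) {
        case '&':  rep = "&amp;";  break;
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '\'': rep = "&apos;"; break;
        case '"':  rep = "&quot;"; break;
        default:   rep = single;   break;
        }

        len = strlen(rep);
        /* store only whole replacements, and keep room for the terminator */
        if (!truncated && out + len < dstsize) {
            memcpy(dst + out, rep, len);
            out += len;
        } else {
            truncated = 1;
        }
        needed += len;
    }

    if (dstsize > 0)
        dst[out] = '\0';
    return needed;
}
```

Escaping all five characters is more than strictly required in most positions, but it keeps the result usable both in content and inside either kind of quotes.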
164
165/*
166 *@@gloss: parameter_entities parameter entities
167 * Parameter entities can only be referenced in the @DTD.
168 * A parameter entity is identified by placing "% " (percent-space)
169 * in front of its name in the declaration. The percent sign is
170 * also used in references to parameter entities, instead of the
171 * ampersand. Parameter entity references are immediately expanded
172 * in the DTD and their replacement text is
173 * part of the declaration, whereas normal @entity_references are not
174 * expanded.
175 */
176
177/*
178 *@@gloss: external_entities external entities
179 * As opposed to @internal_entities, "external entities" refer
180 * to different storage.
181 *
182 * They must have a "system ID" with the URI specifying where
183 * the entity can be retrieved. Those URIs may be absolute
184 * or relative. Unless otherwise provided (e.g. by a special
185 * XML element type defined by a particular @DTD, or
186 * @processing_instructions defined by a particular application
187 * specification), relative URIs are relative to the location
188 * of the resource within which the entity declaration occurs.
189 *
190 * Optionally, external entities may specify a "public ID"
191 * as well. An XML processor attempting to retrieve the entity's
192 * content may use the public identifier to try to generate an
193 * alternative URI. If the processor is unable to do so, it must
194 * use the URI specified in the system literal. Before a match
195 * is attempted, all strings of @whitespace in the public
196 * identifier must be normalized to single space characters (#x20),
197 * and leading and trailing white space must be removed.
198 *
199 * An external entity is not always parsed.
200 *
201 * External entities allow an XML document to refer to an external
202 * file. External entities contain either text or binary data. If
203 * they contain text, the content of the external file is inserted
204 * at the point of reference and parsed as part of the referring
205 * document. Binary data is not parsed and may only be referenced
206 * in an attribute that has been declared as ENTITY or ENTITIES.
207 * Binary data is used to reference figures and
208 * other non-XML content in the document.
209 *
210 * Examples of external entity declarations:
211 +
212 + <!ENTITY open-hatch
213 + SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
214 + <!ENTITY open-hatch
215 + PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
216 + "http://www.textuality.com/boilerplate/OpenHatch.xml">
217 + <!ENTITY hatch-pic
218 + SYSTEM "../grafix/OpenHatch.gif" NDATA gif >
219 *
220 * Character @encoding is processed on a per-external-entity basis.
221 * As a result, each external parsed entity in an XML document may
222 * use a different encoding for its characters.
223 *
224 * In the document entity, the encoding declaration is part of the XML
225 * @text_declaration.
226 *
227 * Also see @entities.
228 */
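The normalization of public identifiers described above (whitespace runs collapsed to one space, leading and trailing whitespace removed) might look like this in C. `normalize_public_id` is a hypothetical name chosen for the example, not an expat function.

```c
#include <stddef.h>

/* XML whitespace: space, tab, carriage return, line feed. */
static int is_xml_space(int c)
{
    return c == 0x20 || c == 0x09 || c == 0x0D || c == 0x0A;
}

/*
 * normalize_public_id (hypothetical helper):
 *      normalizes a public identifier in place and returns its
 *      new length.
 */
size_t normalize_public_id(char *s)
{
    char *in = s, *out = s;
    int pending_space = 0;

    for (; *in; in++) {
        if (is_xml_space((unsigned char)*in)) {
            /* remember the run, but never emit leading whitespace */
            pending_space = (out != s);
        } else {
            if (pending_space) {
                *out++ = ' ';       /* one space for the whole run */
                pending_space = 0;
            }
            *out++ = *in;
        }
    }
    /* a pending space at the end is trailing whitespace: dropped */
    *out = '\0';
    return (size_t)(out - s);
}
```

Two public identifiers would then be compared with a plain strcmp() after both have been normalized.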
229
230/*
231 *@@gloss: external_parsed_entities external parsed entities
232 * An external parsed entity is an external entity whose content
233 * is parsed as XML; entities declared with NDATA are not parsed.
234 *
235 * See @external_entities.
236 */
237
238/*
239 *@@gloss: markup markup
240 * XML "markup" encodes a description of the @document's storage
241 * layout and logical structure.
242 *
243 * Markup is either @elements, @entity_references, @comments, @CDATA
244 * section delimiters, @DTD's, or @processing_instructions.
245 *
246 * XML "text" consists of markup and @content.
247 */
248
249/*
250 *@@gloss: whitespace whitespace
251 * In XML, "whitespace" consists of one or more space (0x20)
252 * characters, carriage returns, line feeds, or tabs.
253 *
254 * Whitespace handling in XML can vary. In @markup, it is
255 * used to separate the individual tokens, of course. However,
256 * in @content (i.e. non-markup), an application may
257 * or may not be interested in white space. Whitespace
258 * handling can therefore be handled differently for each
259 * element with the use of the special "xml:space" @attributes.
260 */
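A minimal sketch of the whitespace definition above, in C. Both function names are hypothetical. Note that the standard isspace() is not a drop-in replacement, because in the "C" locale it also accepts vertical tab and form feed, which XML does not treat as whitespace.

```c
/*
 * xml_is_space (hypothetical helper):
 *      returns 1 for the four XML whitespace characters:
 *      space (0x20), tab (0x09), carriage return (0x0D)
 *      and line feed (0x0A).
 */
int xml_is_space(int c)
{
    return c == 0x20 || c == 0x09 || c == 0x0D || c == 0x0A;
}

/*
 * xml_is_blank (hypothetical helper):
 *      returns 1 if the string consists of XML whitespace only
 *      (an empty string counts as blank). An application could
 *      use this to skip whitespace-only content unless
 *      xml:space="preserve" is in effect.
 */
int xml_is_blank(const char *s)
{
    for (; *s; s++)
        if (!xml_is_space((unsigned char)*s))
            return 0;
    return 1;
}
```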
261
262/*
263 *@@gloss: character_reference character reference
264 * Character references escape Unicode characters. They are
265 * a special case of @entity_references.
266 *
267 * They may be used to refer to a specific character in the
268 * ISO/IEC 10646 character set, for example one not directly
269 * accessible from available input devices.
270 *
271 * If the character reference begins with "&amp;#x", the
272 * digits and letters up to the terminating ";" provide a
273 * hexadecimal representation of the character's code point in
274 * ISO/IEC 10646. If it begins just with "&amp;#", the
275 * digits up to the terminating ";" provide a decimal
276 * representation of the character's code point.
277 */
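The decimal and hexadecimal forms described above can be decoded with a few lines of C. This is only a sketch: `parse_char_ref` is a hypothetical name, the code assumes an ASCII-compatible execution character set, and it checks only the upper bound 0x10FFFF, not the full set of characters XML considers legal.

```c
#include <stddef.h>

/*
 * parse_char_ref (hypothetical helper):
 *      s must point at the "&" of a character reference. Returns
 *      the code point, or -1 if the reference is malformed. On
 *      success, *end (if not NULL) is set to the character after
 *      the terminating ";".
 */
long parse_char_ref(const char *s, const char **end)
{
    unsigned long cp = 0;
    int base = 10;
    int ndigits = 0;

    if (s[0] != '&' || s[1] != '#')
        return -1;
    s += 2;
    if (*s == 'x') {            /* only a lowercase "x" is allowed */
        base = 16;
        s++;
    }

    for (; *s != ';'; s++, ndigits++) {
        int d;

        if (*s >= '0' && *s <= '9')
            d = *s - '0';
        else if (base == 16 && *s >= 'a' && *s <= 'f')
            d = *s - 'a' + 10;
        else if (base == 16 && *s >= 'A' && *s <= 'F')
            d = *s - 'A' + 10;
        else
            return -1;          /* also catches a missing ";" */

        cp = cp * (unsigned long)base + (unsigned long)d;
        if (cp > 0x10FFFFUL)    /* beyond the ISO/IEC 10646 range */
            return -1;
    }

    if (ndigits == 0)
        return -1;
    if (end != NULL)
        *end = s + 1;
    return (long)cp;
}
```

The digits are parsed by hand rather than with strtoul(), because strtoul() would also accept leading whitespace, a sign, or a "0x" prefix, none of which is allowed in a character reference.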
278
279/*
280 *@@gloss: content content
281 * XML "text" consists of @markup and "content" (the XML spec
282 * calls this "character data"). Content is simply everything
283 * that is not markup.
284 *
285 * To access characters that would either otherwise be recognized
286 * as @markup or are difficult to reach via the keyboard, XML
287 * allows for using a @character_reference.
288 *
289 * Within @elements, content is any string of
290 * characters which does not contain the start-delimiter of
291 * any markup. In a @CDATA section, content is any
292 * string of characters not including the CDATA-section-close
293 * delimiter, "]]>".
294 *
295 * The character @encodings may vary between @external_parsed_entities.
296 */
297
298/*
299 *@@gloss: names names
300 * In XML, a "name" is a token beginning with a letter or one of a
301 * few punctuation characters, and continuing with letters,
302 * digits, hyphens, underscores, colons, or full stops,
303 * together known as name characters. The colon has a
304 * special meaning with XML namespaces.
305 */
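For the ASCII range, the name rule above can be sketched as follows. `is_ascii_xml_name` is a hypothetical helper; real XML names may also contain many non-ASCII letters, which this example deliberately rejects. It assumes that underscore and colon are the punctuation characters allowed at the start of a name.

```c
/* First character: a letter, underscore or colon (ASCII only). */
static int is_name_start(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        || c == '_' || c == ':';
}

/* Following characters: additionally digits, hyphens and full stops. */
static int is_name_char(int c)
{
    return is_name_start(c) || (c >= '0' && c <= '9')
        || c == '-' || c == '.';
}

/*
 * is_ascii_xml_name (hypothetical helper):
 *      returns 1 if s is a non-empty name made of ASCII name
 *      characters only, 0 otherwise.
 */
int is_ascii_xml_name(const char *s)
{
    if (!is_name_start((unsigned char)*s))
        return 0;
    for (s++; *s; s++)
        if (!is_name_char((unsigned char)*s))
            return 0;
    return 1;
}
```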
306
307/*
308 *@@gloss: elements elements
309 * Elements are the most common form of XML @markup.
310 * They are identified by their @names.
311 *
312 * As opposed to HTML, there are two types of elements:
313 *
314 * A non-empty element starts and ends with a start-tag
315 * and an end-tag:
316 *
317 + <LI>...</LI>
318 *
319 * As opposed to HTML, an empty element must have an
320 * empty-element tag:
321 *
322 + <P /> <IMG align="left" src="http://www.w3.org/Icons/WWW/w3c_home" />
323 *
324 * In addition, @attributes contain extra parameters to elements.
325 * If the element has attributes, they must be in the start-tag
326 * (or empty-element tag).
327 *
328 * For non-empty elements, the text between the start-tag
329 * and end-tag is called the element's content and may
330 * contain other elements, character data, an entity
331 * reference, a @CDATA section, a processing instruction,
332 * or a comment.
333 *
334 * The XML specs break this into "content particles".
335 *
336 * An element has "mixed content" when it may contain
337 * @content, optionally interspersed with child
338 * elements. In this case, the types of the child
339 * elements may be constrained by a document's @DTD, but
340 * not their order or their number of occurrences.
341 */
342
343/*
344 *@@gloss: attributes attributes
345 * "Attributes" are name-value pairs that have been associated
346 * with @elements. Attributes can only appear in start-tags
347 * or empty-element tags.
348 *
349 * Attributes are identified by their @names. Each such
350 * identifier may only appear once per element.
351 *
352 * As opposed to HTML, attribute values must be quoted (either
353 * in single or double quotes). You may use a @character_reference
354 * to escape quotes in attribute values.
355 *
356 * Example of an attribute:
357 *
358 + <IMG SRC="mypic.gif" />
359 *
360 * SRC="mypic.gif" is the attribute here.
361 *
362 * There are a few <B>special attributes</B> defined by XML.
363 * In @valid documents, these attributes, like any other,
364 * must be declared if they are used. These attributes are
365 * recursive, i.e. they are considered to apply to all elements
366 * within the content of the element where they are specified,
367 * unless overridden in a sub-element.
368 *
369 * -- "xml:space" may be attached to an element to signal
370 * that @whitespace should be preserved for this element.
371 *
372 * The value "default" signals that applications' default
373 * whitespace processing modes are acceptable for this
374 * element; the value "preserve" indicates the intent that
375 * applications preserve all the white space.
376 *
377 * -- "xml:lang" may be inserted in documents to specify the
378 * language used in the contents and attribute values of
379 * any element in an XML document.
380 *
381 * The value is either a two-letter language code (e.g. "en")
382 * or a combination of language and country code. Interestingly,
383 * the English W3C XML spec gives the following examples:
384 *
385 + <p xml:lang="en">The quick brown fox jumps over the lazy dog.</p>
386 + <p xml:lang="en-GB">What colour is it?</p>
387 + <p xml:lang="en-US">What color is it?</p>
388 + <sp who="Faust" desc='leise' xml:lang="de">
389 + <l>Habe nun, ach! Philosophie,</l>
390 + <l>Juristerei, und Medizin</l>
391 + <l>und leider auch Theologie</l>
392 + <l>durchaus studiert mit heißem Bemüh'n.</l>
393 + </sp>
394 */
395
396/*
397 *@@gloss: comments comments
398 * Comments may appear anywhere in a document outside other
399 * markup; in addition, they may appear within the @DTD at
400 * places allowed by the grammar. They are not part of the
401 * document's @content; an XML processor may, but
402 * need not, make it possible for an application to retrieve
403 * the text of comments (@expat has a handler for this).
404 *
405 * Comments may contain any text except "--" (double-hyphen).
406 *
407 * Example of a comment:
408 *
409 + <!-- declarations for <head> & <body> -->
410 */
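Before writing text out as a comment, a program may want to check the double-hyphen restriction mentioned above. A minimal C sketch, with the hypothetical name `is_valid_comment_text`; it also rejects text ending in a hyphen, since that hyphen would run into the closing "-->" and form a double hyphen there.

```c
#include <stddef.h>
#include <string.h>

/*
 * is_valid_comment_text (hypothetical helper):
 *      returns 1 if text may be placed between "<!--" and "-->",
 *      0 otherwise.
 */
int is_valid_comment_text(const char *text)
{
    size_t len = strlen(text);

    if (strstr(text, "--") != NULL)
        return 0;                   /* double hyphen inside the text */
    if (len > 0 && text[len - 1] == '-')
        return 0;                   /* would produce "--->" */
    return 1;
}
```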
411
412/*
413 *@@gloss: CDATA CDATA
414 * CDATA sections can appear anywhere where @content
415 * is allowed. They are used to escape blocks of
416 * text containing characters which would otherwise be
417 * recognized as @markup.
418 *
419 * CDATA sections begin with the string &lt;![CDATA[ and end
420 * with the string ]]&gt;. Within a CDATA section, only the
421 * ]]&gt; string is recognized as @markup, so that left angle
422 * brackets and ampersands may occur in their literal form.
423 * They need not (and cannot) be escaped using "&amp;lt;" and
424 * "&amp;amp;". (This implies that not even @comments are
425 * recognized).
426 *
427 * CDATA sections cannot nest.
428 *
429 * Examples:
430 *
431 + <![CDATA[<greeting>Hello, world!</greeting>]]>
432 +
433 + <![CDATA[
434 + *p = &q;
435 + b = (i <= 3);
436 + ]]>
437 */
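Because "]]>" cannot be escaped inside a CDATA section and sections cannot nest, a writer that wants to wrap arbitrary text has to split the text wherever that delimiter occurs. The following C sketch shows one common way of doing this; `cdata_wrap` and its snprintf-like return value are assumptions of this example.

```c
#include <stddef.h>
#include <string.h>

/* Appends n bytes of s at *pos, silently dropping what does not fit. */
static void put(const char *s, size_t n,
                char *dst, size_t dstsize, size_t *pos)
{
    size_t i;

    for (i = 0; i < n; i++, (*pos)++)
        if (*pos + 1 < dstsize)
            dst[*pos] = s[i];
}

/*
 * cdata_wrap (hypothetical helper):
 *      wraps text in one or more CDATA sections. Each "]]>" in
 *      the text is split so that "]]" ends one section and ">"
 *      starts the next. Returns the length the complete result
 *      needs; a value >= dstsize means the output was truncated.
 */
size_t cdata_wrap(const char *text, char *dst, size_t dstsize)
{
    size_t pos = 0;
    const char *hit;

    put("<![CDATA[", 9, dst, dstsize, &pos);
    while ((hit = strstr(text, "]]>")) != NULL) {
        /* copy up to and including "]]", then close and reopen */
        put(text, (size_t)(hit - text) + 2, dst, dstsize, &pos);
        put("]]><![CDATA[", 12, dst, dstsize, &pos);
        text = hit + 2;             /* continue at the ">" */
    }
    put(text, strlen(text), dst, dstsize, &pos);
    put("]]>", 3, dst, dstsize, &pos);

    if (dstsize > 0)
        dst[pos < dstsize ? pos : dstsize - 1] = '\0';
    return pos;
}
```

A parser reading the result sees adjacent CDATA sections whose contents, concatenated, give back the original text.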
438
439/*
440 *@@gloss: processing_instructions processing instructions
441 * "Processing instructions" (PIs) contain additional
442 * data for applications.
443 *
444 * Like @comments, they are not textually part of the XML
445 * document, but the XML processor is required to pass
446 * them to an application.
447 *
448 * PIs have the form:
449 *
450 + <?name pidata?>
451 *
452 *
453 * The "name", called the PI "target", identifies the PI to
454 * the application. Applications should process only the
455 * targets they recognize and ignore all other PIs. Any
456 * data that follows the PI target is optional; it is for
457 * the application that recognizes the target. The names
458 * used in PIs may be declared in a @notation_declaration in order to
459 * formally identify them.
460 *
461 * PI names beginning with "xml" are reserved.
462 */
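An application that defines its own PI targets could guard against the reserved prefix with a check like the one below. `pi_target_is_reserved` is a hypothetical name; the sketch assumes that the reservation applies to "xml" in any mix of upper and lower case.

```c
#include <ctype.h>

/*
 * pi_target_is_reserved (hypothetical helper):
 *      returns 1 if the PI target begins with "xml" in any
 *      combination of upper and lower case, 0 otherwise.
 */
int pi_target_is_reserved(const char *target)
{
    static const char prefix[] = "xml";
    int i;

    for (i = 0; i < 3; i++)
        /* a short target hits its null terminator here and fails */
        if (tolower((unsigned char)target[i]) != prefix[i])
            return 0;
    return 1;
}
```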
463
464/*
465 *@@gloss: well-formed well-formed
466 * XML @documents (the sum of all @entities) are "well-formed"
467 * if the following conditions are met (among others):
468 *
469 * -- They contain one or more @elements.
470 *
471 * -- There is exactly one element, called the root, or document
472 * element, no part of which appears in the @content of any
473 * other element.
474 *
475 * -- For all other elements, if the start-tag is in the content
476 * of another element, the end-tag is in the content of the
477 * same element. More simply stated, the elements nest
478 * properly within each other. (This is unlike HTML.)
479 *
480 * -- Values of string @attributes cannot contain references to
481 * @external_entities.
482 *
483 * -- No attribute may appear more than once in the same element.
484 *
485 * -- All entities except amp, lt, gt, apos, and quot must be
486 * declared before they are used. Binary @external_entities
487 * cannot be referenced in the flow of @content; they can only
488 * be used in an attribute declared as ENTITY or ENTITIES.
489 *
490 * -- Neither text nor @parameter_entities are allowed to be
491 * recursive, directly or indirectly.
492 */
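The nesting and single-root conditions above can be illustrated with a small stack-based check. This is a deliberately simplified C sketch, not a parser: the hypothetical `tags_nest_properly` understands only start-tags, end-tags and empty-element tags, and assumes the input contains no declarations, processing instructions, comments, CDATA sections, or ">" characters inside attribute values. The limits MAX_DEPTH and MAX_NAME are arbitrary.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 64    /* arbitrary limits for this sketch */
#define MAX_NAME  64

/*
 * tags_nest_properly (hypothetical helper):
 *      returns 1 if all tags in doc nest properly and there is
 *      exactly one root element, 0 otherwise.
 */
int tags_nest_properly(const char *doc)
{
    char stack[MAX_DEPTH][MAX_NAME];    /* names of the open elements */
    int depth = 0, roots = 0;
    const char *p = doc;

    while ((p = strchr(p, '<')) != NULL) {
        const char *gt = strchr(p, '>');
        const char *name = p + 1;
        int closing, empty;
        size_t len;

        if (gt == NULL)
            return 0;                   /* unterminated tag */
        closing = (*name == '/');
        if (closing)
            name++;
        empty = (!closing && gt > name && gt[-1] == '/');

        /* the name ends at whitespace, "/" or ">" */
        len = strcspn(name, " \t\r\n/>");
        if (len == 0 || len >= MAX_NAME)
            return 0;

        if (closing) {
            /* an end-tag must match the innermost open element */
            if (depth == 0
                || strlen(stack[depth - 1]) != len
                || strncmp(stack[depth - 1], name, len) != 0)
                return 0;
            depth--;
        } else {
            if (depth == 0 && ++roots > 1)
                return 0;               /* second root element */
            if (!empty) {
                if (depth == MAX_DEPTH)
                    return 0;
                memcpy(stack[depth], name, len);
                stack[depth][len] = '\0';
                depth++;
            }
        }
        p = gt + 1;
    }
    return depth == 0 && roots == 1;
}
```

A real processor such as @expat checks far more than this; the sketch only makes the "start-tag and end-tag in the content of the same element" rule concrete.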
493
494/*
495 *@@gloss: valid valid
496 * XML @documents are said to be "valid" if they have a @DTD
497 * associated and they conform to it. While XML documents
498 * must always be @well-formed, validation and validity are up
499 * to the implementation (i.e. at the option of the application).
500 *
501 * Validating processors must report violations of the constraints
502 * expressed by the declarations in the @DTD, and failures to
503 * fulfill the validity constraints given in the XML specification.
504 * To accomplish this, validating XML processors must read and
505 * process the entire DTD and all @external_parsed_entities
506 * referenced in the document.
507 *
508 * Non-validating processors (such as @expat) are required to
509 * check only the document entity (see @entities), including the
510 * entire internal DTD subset, for whether it is @well-formed.
511 *
512 * While they are not required to check the document for validity,
513 * they are required to process all the declarations they
514 * read in the internal DTD subset and in any parameter entity
515 * that they read, up to the first reference to a parameter
516 * entity that they do not read; that is to say, they must
517 * use the information in those declarations to normalize
518 * values of @attributes, include the replacement text of
519 * @internal_entities, and supply default attribute values.
520 * They must not process entity declarations or attribute-list
521 * declarations encountered after a reference to a
522 * parameter entity that is not read, since the entity may have
523 * contained overriding declarations.
524 */
525
526/*
527 *@@gloss: encodings encodings
528 * XML supports a wide variety of character encodings. These
529 * must be specified in the XML @text_declaration.
530 *
531 * There are too many character encodings on the planet to
532 * be listed here. The most common ones are:
533 *
534 * -- "UTF-8", "UTF-16", "ISO-10646-UCS-2", and "ISO-10646-UCS-4"
535 * should be used for the various encodings and transformations
536 * of Unicode / ISO/IEC 10646.
537 *
538 * -- "ISO-8859-x" (with "x" being a number from 1 to 9) represent
539 * the various ISO 8859 ("Latin") encodings.
540 *
541 * -- "ISO-2022-JP", "Shift_JIS", and "EUC-JP" should be used for
542 * the various encoded forms of JIS X-0208-1997.
543 *
544 * Example of a @text_declaration:
545 *
546 + <?xml version="1.0" encoding="ISO-8859-2"?>
547 *
548 * All XML processors must be able to read @entities in either
549 * UTF-8 or UTF-16. @expat directly supports the following
550 * (see XML_SetUnknownEncodingHandler):
551 *
552 * -- UTF-8: 8-bit encoding of Unicode.
553 *
554 * -- UTF-16: 16-bit encoding of Unicode.
555 *
556 * -- ISO-8859-1: that's "latin 1".
557 *
558 * -- US-ASCII.
559 *
560 * Entities encoded in UTF-16 must begin with the ZERO WIDTH NO-BREAK
561 * SPACE character (#xFEFF). This is an encoding signature, not part
562 * of either the @markup or the @content of the XML @document.
563 * XML processors must be able to use this character to differentiate
564 * between UTF-8 and UTF-16 encoded documents.
565 */
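The encoding signature described above can be detected by looking at the first bytes of an entity. The following C sketch uses hypothetical names (`bom_kind`, `detect_bom`). It also recognizes the three-byte UTF-8 form of #xFEFF, which some editors write, and it does not attempt to tell UTF-32 from UTF-16.

```c
#include <stddef.h>

typedef enum {
    BOM_NONE,       /* no signature found */
    BOM_UTF8,       /* EF BB BF */
    BOM_UTF16_BE,   /* FE FF */
    BOM_UTF16_LE    /* FF FE */
} bom_kind;

/*
 * detect_bom (hypothetical helper):
 *      inspects the first bytes of an entity and reports which
 *      encoding signature, if any, it starts with.
 */
bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;
    return BOM_NONE;
}
```

When no signature is present, a processor would fall back to the encoding named in the @text_declaration, or to UTF-8 if there is none.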
566
567/*
568 *@@gloss: text_declaration text declaration
569 * XML @documents and @external_parsed_entities may (and
570 * should) start with the XML text declaration, exactly like
571 * this:
572 *
573 + <?xml version="1.0" encoding="enc"?>
574 *
575 * where "1.0" is the only currently defined XML version
576 * and "enc" must be the encoding of the document.
577 *
578 * External parsed entities may begin with a text declaration,
579 * which looks like an XML declaration with just an encoding
580 * declaration:
581 *
582 + <?xml encoding="Big5"?>
583 *
584 * See @encodings.
585 *
586 * Example:
587 *
588 + <?xml version="1.0" encoding="ISO-8859-1"?>
589 */
590
591/*
592 *@@gloss: documents documents
593 * XML documents are made up of storage units called @entities,
594 * which contain either parsed or unparsed data. Parsed data is
595 * made up of characters, some of which form @content,
596 * and some of which form @markup.
597 *
598 * XML documents should start with the XML @text_declaration.
599 *
600 * The function of the @markup in an XML document is to describe
601 * its storage and logical structure and to associate attribute-value
602 * pairs with its logical structures. XML provides a mechanism,
603 * the document type declaration (@DTD), to define constraints
604 * on the logical structure and to support the use of predefined
605 * storage units.
606 *
607 * A data object is an XML document if it is @well-formed.
608 * A well-formed XML document may in addition be @valid if it
609 * meets certain further constraints.
610 *
611 * A very simple XML document looks like this:
612 *
613 + <?xml version="1.0"?>
614 + <oldjoke>
615 + <burns>Say <quote>goodnight</quote>, Gracie.</burns>
616 + <allen><quote>Goodnight, Gracie.</quote></allen>
617 + <applause/>
618 + </oldjoke>
619 *
620 * This document is @well-formed, but not @valid (because it
621 * has no @DTD).
622 *
623 */
624
625/*
626 *@@gloss: element_declaration element declaration
627 * Element declarations identify the @names of elements and the
628 * nature of their content. They look like this:
629 +
630 + <!ELEMENT name contentspec>
631 +
632 * No element may be declared more than once.
633 *
634 * The "name" of the element is obvious. The "contentspec"
635 * is not. This specifies what may appear in the element
636 * and can be one of the following:
637 *
638 * -- "EMPTY" marks the element as being empty (i.e.
639 * having no content at all).
640 *
641 * -- "ANY" does not impose any restrictions.
642 *
643 * -- (mixed): a "list" which declares the element to have
644 * mixed content. See below.
645 *
646 * -- (children): a "list" which declares the element to
647 * have child elements only, but no content. See below.
648 *
649 * <B>(mixed): content with elements</B>
650 *
651 * With the (mixed) contentspec, an element may either contain
652 * @content only or @content with subelements.
653 *
654 * While the (children) contentspec allows you to define sequences
655 * and orders, this is not possible with (mixed).
656 *
657 * "contentspec" must then be a pair of parentheses, optionally
658 * followed by "*". In the parentheses, there must be at least the
659 * keyword "#PCDATA", optionally followed by "|" and element
660 * names. Note that if no #PCDATA appears, the (children) model
661 * is assumed (see below).
662 *
663 * Examples:
664 *
665 + <!ELEMENT name (#PCDATA)* >
666 + <!ELEMENT name (#PCDATA | subname1 | subname2)* >
667 + <!ELEMENT name (#PCDATA) >
668 *
669 * Note that if you specify sub-element names, you must terminate
670 * the contentspec with "*". Again, there's no way to specify
671 * orders etc. with (mixed).
672 *
673 * <B>(children): Element content only</B>
674 *
675 * With the (children) contentspec, an element may contain
676 * only other elements (and @whitespace), but no other @content.
677 *
678 * This can become fairly complicated. "contentspec" then must be
679 * a "list" followed by a "repeater".
680 *
681 * A "repeater" can be:
682 *
683 * -- Nothing: the preceding item _must_ appear exactly once.
684 *
685 * -- "+": the preceding item _must_ appear at _least_ once.
686 *
687 * -- "?": the preceding item _may_ appear exactly once.
688 *
689 * -- "*": the preceding item _may_ appear once or more than
690 * once or not at all.
691 *
692 * Here's the simplest example (presuming that "SUBELEMENT"
693 * is a valid "list" here):
694 *
695 + <!ELEMENT name (SUBELEMENT)* >
696 *
697 * In other words, in (children) mode, "contentspec" must always
698 * be in brackets and is followed by a "repeater" (which can be
699 * nothing).
700 *
701 * About "lists"... since these declarations may nest, this is
702 * where the recursive definition of a "content particle" comes
703 * in:
704 *
705 * -- A "content particle" is either a sub-element name or
706 * a nested list, followed by a "repeater".
707 *
708 * -- A "list" is defined as an enumeration of content particles,
709 * enclosed in parentheses, where the content particles are
710 * separated by "connectors".
711 *
712 * There are two types of "connectors":
713 *
714 * -- Commas (",") indicate that the elements must appear
715 * in the specified order ("sequence").
716 *
717 * -- Vertical bars ("|") specify that the elements may
718 * occur alternatively ("choice").
719 *
720 * The connectors cannot be mixed; the list must be
721 * either completely "sequence" or "choice".
722 *
723 * Examples of content particles:
724 *
725 + SUBELEMENT+
726 + list*
727 *
728 * Examples of lists:
729 *
730 + ( cp | cp | cp | cp )
731 + ( cp , cp , cp , cp )
732 *
733 * Full examples for (children):
734 *
735 + <!ELEMENT oldjoke ( burns+, allen, applause? ) >
736 +                   | | +cp-+                | |
737 +                   | |                      | |
738 +                   | +------- list ---------+ |
739 +                   +-------contentspec--------+
740 *
741 * This specifies a "sequence" list for the "oldjoke" element. The
742 * list is not nested, so the content particles are element
743 * names only.
744 *
745 * Within "oldjoke", "burns" must appear first and can appear
746 * once or several times.
747 *
748 * Next must be "allen", exactly once (since there's no repeater).
749 *
750 * Optionally ("?"), there can be "applause" at the end.
751 *
752 * Now, a nested example:
753 *
754 + <!ELEMENT poem (title?, (stanza+ | couplet+ | line+) ) >
755 *
756 * That is, a poem consists of an optional title, followed by one or
757 * several stanzas, or one or several couplets, or one or several lines.
758 * This is different from:
759 *
760 + <!ELEMENT poem (title?, (stanza | couplet | line)+ ) >
761 *
762 * The latter allows for a single poem to contain a mixture of stanzas,
763 * couplets or lines.
764 *
765 * And for WarpIN:
766 *
767 + <!ELEMENT WARPIN (REXX*, VARPROMPT*, MSG?, TITLE?, (GROUP | PCK)+, PAGE+) >
768 *
769 */
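To make the repeaters concrete, here is a hand-written check for the "oldjoke" content model ( burns+, allen, applause? ) from the example above. It is a sketch for this one declaration only, and `oldjoke_children_valid` is a hypothetical name; a validating processor would build such a matcher from the DTD instead of hard-coding it.

```c
#include <stddef.h>
#include <string.h>

/*
 * oldjoke_children_valid (hypothetical helper):
 *      children is the list of child element names of one
 *      "oldjoke" element, in document order. Returns 1 if the
 *      list matches ( burns+, allen, applause? ), 0 otherwise.
 */
int oldjoke_children_valid(const char *const *children, size_t n)
{
    size_t i = 0;

    /* burns+ : must appear at least once */
    if (i >= n || strcmp(children[i], "burns") != 0)
        return 0;
    while (i < n && strcmp(children[i], "burns") == 0)
        i++;

    /* allen : no repeater, so exactly once */
    if (i >= n || strcmp(children[i], "allen") != 0)
        return 0;
    i++;

    /* applause? : may appear once */
    if (i < n && strcmp(children[i], "applause") == 0)
        i++;

    /* the sequence must be exhausted: nothing may follow */
    return i == n;
}
```

Because the connectors are commas, the three steps are checked strictly in order; a choice list would instead test the alternatives at the same position.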
770
771/*
772 *@@gloss: attribute_declaration attribute declaration
773 * Attribute declarations identify the @names of attributes
774 * of @elements and their possible values. They look like this:
775 *
776 + <!ATTLIST elementname
777 + attname atttype defaultvalue
778 + attname atttype defaultvalue
779 + ... >
780 *
781 * "elementname" is the element name for which the
782 * attributes are being defined.
783 *
784 * For each attribute, you must then specify three
785 * columns:
786 *
787 * -- "attname" is the attribute name.
788 *
789 * -- "atttype" is the attribute type (one of six values,
790 * see below).
791 *
792 * -- "defaultvalue" specifies the default value.
793 *
794 * The attribute type (specifying the value type) must be
795 * one of six:
796 *
797 * -- "CDATA" is any character data. (This has nothing to
798 * do with @CDATA sections.)
799 *
800 * -- "ID": the value must be a name (see @names) unique within
801 * the document. Only one such attribute is allowed per
802 * element.
803 *
804 * -- "IDREF" or "IDREFS": a reference to some other
805 * element which has an "ID" attribute with this value.
806 * "IDREFS" is the plural and may contain several of
807 * those separated by @whitespace.
808 *
809 * -- "ENTITY" or "ENTITIES": a reference to an
810 * external entity (see @external_entities).
811 * "ENTITIES" is the plural and may contain several of
812 * those separated by @whitespace.
813 *
814 * -- "NMTOKEN" or "NMTOKENS": a single-word string.
815 * This is not a reference though.
816 * "NMTOKENS" is the plural and may contain several of
817 * those separated by @whitespace.
818 *
819 * -- an enumeration: an explicit list of allowed
820 * values for this attribute. Additionally, you can specify
821 * that the names must match a particular @notation_declaration.
822 *
823 * The "defaultvalue" (third column) can be one of these:
824 *
825 * -- "#REQUIRED": the attribute may not be omitted.
826 *
827 * -- "#IMPLIED": the attribute is optional, and there's
828 * no default value.
829 *
830 * -- "'value'": the attribute is optional, and it has
831 * this default.
832 *
833 * -- "#FIXED 'value'": the attribute is optional, but if
834 * it appears, it must have this value.
835 *
836 * Example:
837 *
838 + <!ATTLIST oldjoke
839 + name ID #REQUIRED
840 + label CDATA #IMPLIED
841 + status ( funny | notfunny ) 'funny'>
842 */
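The four kinds of "defaultvalue" listed above determine which value an application sees when an attribute is omitted or specified. The following C sketch models that decision; all type and function names (`att_default_kind`, `att_default`, `resolve_attribute`) are hypothetical and do not reflect how expat represents attribute declarations.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    ATT_REQUIRED,   /* #REQUIRED */
    ATT_IMPLIED,    /* #IMPLIED */
    ATT_DEFAULT,    /* 'value' */
    ATT_FIXED       /* #FIXED 'value' */
} att_default_kind;

typedef struct {
    att_default_kind kind;
    const char *value;      /* only used for ATT_DEFAULT and ATT_FIXED */
} att_default;

/*
 * resolve_attribute (hypothetical helper):
 *      specified is the value found in the document, or NULL if
 *      the attribute was omitted. On success, returns 1 and stores
 *      the effective value in *effective (NULL means "no value").
 *      Returns 0 if the declaration is violated.
 */
int resolve_attribute(const att_default *decl,
                      const char *specified,
                      const char **effective)
{
    *effective = NULL;

    switch (decl->kind) {
    case ATT_REQUIRED:
        if (specified == NULL)
            return 0;                   /* may not be omitted */
        *effective = specified;
        return 1;
    case ATT_IMPLIED:
        *effective = specified;         /* optional, no default */
        return 1;
    case ATT_DEFAULT:
        *effective = specified ? specified : decl->value;
        return 1;
    case ATT_FIXED:
        if (specified != NULL && strcmp(specified, decl->value) != 0)
            return 0;                   /* must have the fixed value */
        *effective = decl->value;
        return 1;
    }
    return 0;
}
```

For the "status" attribute of the example declaration, an omitted attribute would resolve to "funny" under this model.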
843
844/*
845 *@@gloss: entity_declaration entity declaration
846 * Entity declarations define @entities.
847 *
848 * An example of @internal_entities:
849 *
850 + <!ENTITY ATI "ArborText, Inc.">
851 *
852 * Examples of @external_entities:
853 *
854 + <!ENTITY boilerplate SYSTEM "/standard/legalnotice.xml">
855 + <!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>
856 */
857
858/*
859 *@@gloss: notation_declaration notation declaration
860 * Notation declarations identify specific types of external
861 * binary data. This information is passed to the processing
862 * application, which may use it in whatever way it wishes.
863 *
864 * Example:
865 *
866 + <!NOTATION GIF87A SYSTEM "GIF">
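 *
 * How the information reaches the application can be sketched
 * as follows, assuming Python's xml.parsers.expat binding. The
 * wrapper document and the handler name are hypothetical. The
 * parser only reports the declaration; what "GIF" means is left
 * to the application.
 *

```python
import xml.parsers.expat

DOC = """<!DOCTYPE doc [
<!NOTATION GIF87A SYSTEM "GIF">
]>
<doc/>"""

notations = {}  # notation name -> (system identifier, public identifier)


def on_notation_decl(name, base, system_id, public_id):
    notations[name] = (system_id, public_id)


parser = xml.parsers.expat.ParserCreate()
parser.NotationDeclHandler = on_notation_decl
parser.Parse(DOC, True)

print(notations)
```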
867 */
868
869/*
870 *@@gloss: DTD DTD
871 * The XML document type declaration contains or points to
872 * markup declarations that provide a grammar for a class of @documents.
873 * This grammar is known as a Document Type Definition, or DTD.
874 *
875 * The document type declaration must look like the following:
876 *
877 + <!DOCTYPE name ... >
878 *
879 * "name" must match the document's root element.
880 *
881 * "..." can be the reference to an external subset (being a special
882 * case of @external_entities):
883 *
884 + <!DOCTYPE name SYSTEM "whatever.dtd">
885 *
886 * The SYSTEM identifier is required with XML, while a public
887 * identifier is not. (In SGML, neither is required, but at
888 * least one must be present.)
889 *
890 * Alternatively, specify an internal subset in brackets, which
891 * contains the markup directly:
892 *
893 + <!DOCTYPE name [
894 + <!ELEMENT greeting (#PCDATA)>
895 + ]>
896 *
897 * You can even mix both.
898 *
899 * A markup declaration is either an @element_declaration, an
900 * @attribute_declaration, an @entity_declaration,
901 * or a @notation_declaration. These declarations may be contained
902 * in whole or in part within @parameter_entities.
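 *
 * A small sketch of a document type declaration that mixes an
 * external subset reference with an internal subset, assuming
 * Python's xml.parsers.expat binding. The handler names and the
 * "events" list are hypothetical. With default settings this
 * parser does not fetch "whatever.dtd"; it only reports the
 * system identifier and reads the internal subset.
 *

```python
import xml.parsers.expat

DOC = """<!DOCTYPE greeting SYSTEM "whatever.dtd" [
<!ELEMENT greeting (#PCDATA)>
]>
<greeting>Hello, world!</greeting>"""

events = []  # what the parser reported, in order


def on_start_doctype(name, system_id, public_id, has_internal_subset):
    events.append(("doctype", name, system_id, public_id,
                   bool(has_internal_subset)))


def on_element_decl(name, model):
    # The content model is ignored here; only the declared name is kept.
    events.append(("element declaration", name))


def on_start_element(name, attrs):
    events.append(("element", name))


parser = xml.parsers.expat.ParserCreate()
parser.StartDoctypeDeclHandler = on_start_doctype
parser.ElementDeclHandler = on_element_decl
parser.StartElementHandler = on_start_element
parser.Parse(DOC, True)

for event in events:
    print(event)
```

 * The name reported for the doctype is the same as the name of
 * the root element, as required above.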
903 */
904
905/*
906 *@@gloss: DOM DOM
907 * DOM is the "Document Object Model", as defined by the W3C.
908 *
909 * The DOM is a programming interface for @XML @documents.
910 * (XML is a metalanguage and describes the documents
911 * themselves. DOM is a programming interface -- an API --
912 * to access XML documents.)
913 *
914 * The W3C calls this "a platform- and language-neutral
915 * interface that allows programs and scripts to dynamically
916 * access and update the content, structure and style of
917 * documents. The Document Object Model provides
918 * a standard set of objects for representing HTML and XML
919 * documents, a standard model of how these objects can
920 * be combined, and a standard interface for accessing and
921 * manipulating them. Vendors can support the DOM as an
922 * interface to their proprietary data structures and APIs,
923 * and content authors can write to the standard DOM
924 * interfaces rather than product-specific APIs, thus
925 * increasing interoperability on the Web."
926 *
927 * In short, DOM specifies that an XML document is broken
928 * up into a tree of "nodes", representing the various parts
929 * of an XML document. Such nodes represent @documents,
930 * @elements, @attributes, @processing_instructions,
931 * @comments, @content, and more.
932 *
933 * See xml.c for an introduction to XML and DOM support in
934 * the XWorkplace helpers.
935 *
936 * Example: Take this HTML table definition:
937 +
938 + <TABLE>
939 + <TBODY>
940 + <TR>
941 + <TD>Column 1-1</TD>
942 + <TD>Column 1-2</TD>
943 + </TR>
944 + <TR>
945 + <TD>Column 2-1</TD>
946 + <TD>Column 2-2</TD>
947 + </TR>
948 + </TBODY>
949 + </TABLE>
950 *
951 * In the DOM, this would be represented by a tree as follows:
952 +
953 +                  ┌────────────┐
954 +                  │   TABLE    │  (only ELEMENT node in root DOCUMENT node)
955 +                  └─────┬──────┘
956 +                        │
957 +                  ┌─────┴──────┐
958 +                  │   TBODY    │  (only ELEMENT node in root "TABLE" node)
959 +                  └─────┬──────┘
960 +            ┌───────────┴───────────┐
961 +      ┌─────┴──────┐          ┌─────┴──────┐
962 +      │     TR     │          │     TR     │
963 +      └─────┬──────┘          └─────┬──────┘
964 +        ┌───┴──────┐            ┌───┴──────┐
965 +    ┌───┴─┐     ┌──┴──┐     ┌───┴─┐     ┌──┴──┐
966 +    │ TD  │     │ TD  │     │ TD  │     │ TD  │
967 +    └──┬──┘     └──┬──┘     └───┬─┘     └──┬──┘
968 + ╔═════╩════╗ ╔════╩═════╗ ╔════╩═════╗ ╔══╩═══════╗
969 + ║Column 1-1║ ║Column 1-2║ ║Column 2-1║ ║Column 2-2║  (one TEXT node in each parent node)
970 + ╚══════════╝ ╚══════════╝ ╚══════════╝ ╚══════════╝
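 *
 * The same tree can be reproduced with any DOM implementation.
 * The sketch below assumes Python's xml.dom.minidom, not the
 * xwphelpers DOM; the helper name outline is hypothetical. The
 * markup is written without whitespace between the tags, because
 * minidom would otherwise keep each run of indentation as an
 * additional TEXT node.
 *

```python
from xml.dom import minidom, Node

MARKUP = (
    "<TABLE><TBODY>"
    "<TR><TD>Column 1-1</TD><TD>Column 1-2</TD></TR>"
    "<TR><TD>Column 2-1</TD><TD>Column 2-2</TD></TR>"
    "</TBODY></TABLE>"
)


def outline(node, depth=0):
    """Return one indented line per node, visiting the tree depth-first."""
    if node.nodeType == Node.TEXT_NODE:
        label = "TEXT " + repr(node.data)
    elif node.nodeType == Node.DOCUMENT_NODE:
        label = "DOCUMENT"
    else:
        label = "ELEMENT " + node.nodeName
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


document = minidom.parseString(MARKUP)
tree = outline(document)
print("\n".join(tree))
```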
971 */
972
973/*
974 *@@gloss: DOM_DOCUMENT DOCUMENT
975 * representation of XML @documents in the @DOM.
976 *
977 * The xwphelpers implementation differs from the DOM
978 * specs as follows:
979 *
980 * -- The "doctype" member points to the document's @DTD, or is NULL.
981 * In our implementation, this is the pvExtra pointer, which points
982 * to a _DOMDTD.
983 *
984 * -- The "implementation" member points to a DOMImplementation object.
985 * This is not supported here.
986 *
987 * -- The "documentElement" member is a convenience pointer to the
988 * document's root element. We don't supply this field; instead,
989 * the llChildren list only contains a single ELEMENT node for the
990 * root element.
991 *
992 * -- The "createElement" method is implemented by xmlCreateElementNode.
993 *
994 * -- The "createAttribute" method is implemented by xmlCreateAttributeNode.
995 *
996 * -- The "createTextNode" method is implemented by xmlCreateTextNode,
997 * which has an extra parameter though.
998 *
999 * -- The "createComment" method is implemented by xmlCreateCommentNode.
1000 *
1001 * -- The "createProcessingInstruction" method is implemented by
1002 * xmlCreatePINode.
1003 *
1004 * -- The "createDocumentFragment", "createCDATASection", and
1005 * "createEntityReference" methods are not supported.
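 *
 * For comparison, this is roughly how the standard DOCUMENT
 * members and methods named above look in a DOM implementation
 * that follows the specs closely. The sketch assumes Python's
 * xml.dom.minidom; it does not show the xwphelpers functions,
 * whose parameters are not listed here. The element, attribute
 * and processing instruction names are made up.
 *

```python
from xml.dom import minidom

implementation = minidom.getDOMImplementation()
# createDocument(namespaceURI, qualifiedName, doctype): no namespace, no DTD.
document = implementation.createDocument(None, "oldjoke", None)

# "documentElement" is the convenience pointer to the root element.
root = document.documentElement

# createAttribute only creates the node; it must be attached explicitly.
label = document.createAttribute("label")
label.value = "classic"
root.setAttributeNode(label)

# The other factory methods work the same way: create, then append.
root.appendChild(document.createComment("setup"))
root.appendChild(document.createTextNode("Knock, knock."))
root.appendChild(document.createProcessingInstruction("render", "bold"))

print(root.toxml())
```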
1006 */
1007
1008