source: trunk/src/helpers/xmldefs.c@ 46

Last change on this file since 46 was 39, checked in by umoeller, 24 years ago
Misc. fixes.
Property svn:eol-style set to `CRLF` Property svn:keywords set to `Author Date Id Revision`
File size: 38.2 KB

1
2	/*
3	*@@sourcefile xmldefs.c:
4	* this file is just for xdoc and contains glossary items for
5	* XML. It is never compiled.
6	*
7	*@@added V0.9.6 (2000-10-29) [umoeller]
8	*/
9
10	/*
11	* Copyright (C) 2001 Ulrich Möller.
12	* This file is part of the "XWorkplace helpers" source package.
13	* This is free software; you can redistribute it and/or modify
14	* it under the terms of the GNU General Public License as published
15	* by the Free Software Foundation, in version 2 as it comes in the
16	* "COPYING" file of the XWorkplace main distribution.
17	* This program is distributed in the hope that it will be useful,
18	* but WITHOUT ANY WARRANTY; without even the implied warranty of
19	* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
20	* GNU General Public License for more details.
21	*/
22
23	/*
24	*@@gloss: expat expat
25	* Expat is one of the most well-known XML processors (parsers).
26	* I (umoeller) have ported expat to the XWorkplace Helpers
27	* library. See xmlparse.c for an introduction to expat. See
28	* xml.c for an introduction to XML support in the XWorkplace
29	* Helpers in general.
30	*/
31
32
33	/*
34	*@@gloss: XML XML
35	* XML is the Extensible Markup Language, as defined by
36	* the W3C. XML isn't really a language, but a meta-language
37	* for describing markup languages. It is a simplified subset
38	* of SGML.
39	*
40	* You should be familiar with the following:
41	*
42	* -- XML parsers operate on XML @documents.
43	*
44	* -- Each XML document has both a physical and a logical
45	* structure.
46	*
47	* Physically, the document is composed of units called
48	* @entities.
49	*
50	* Logically, the document is composed of @markup and
51	* @content. Among other things, markup separates the content
52	* into @elements.
53	*
54	* -- The logical and physical structures must nest properly (be
55	* @well-formed) for each entity, which results in the entire
56	* XML document being well-formed as well.
57	*/
58
59	/*
60	*@@gloss: entities entities
61	* An "entity" is an XML storage unit. It's a very abstract
62	* concept, and the term doesn't make much sense, but it was
63	* in SGML already, and XML chose to inherit it.
64	*
65	* In the simplest case, an XML document has only one entity,
66	* which is an XML file (or memory buffer from wherever).
67	* The document entity serves as the root of the entity tree
68	* and a starting-point for an XML processor. Unlike other
69	* entities, the document entity has no name and might well
70	* appear on a processor input stream without any identification
71	* at all.
72	*
73	* Entities are defined to be either parsed or unparsed.
74	*
75	* Other than that, there are @internal_entities,
76	* @external_entities, and @parameter_entities.
77	*
78	* See @entity_references for how to reference entities.
79	*/
80
81	/*
82	*@@gloss: entity_references entity references
83	* An "entity reference" refers to the content of a named
84	* entity (see: @entities). It is enclosed in "&" and ";"
85	* characters.
86	*
87	* If you declare @internal_entities in the @DTD, referencing
88	* them allows for text replacements as in SGML:
89	*
90	+ This document was prepared on &PrepDate;.
91	*
92	* The same works for @external_entities though. Assuming
93	* that "SecondFile" has been declared in the DTD to point
94	* to another file,
95	*
96	+ See the following README: &SecondFile;
97	*
98	* would then insert the complete contents of the second
99	* file into the document. The XML processor will parse
100	* that file as if it were at that position in the original
101	* document.
102	*
103	* An entity is "included" when its replacement text
104	* is retrieved and processed, in place of the reference itself,
105	* as though it were part of the document at the location the
106	* reference was recognized.
107	* The replacement text may contain
108	* both @content and (except for @parameter_entities)
109	* @markup, which must be recognized in the usual way, except
110	* that the replacement text of entities used to escape markup
111	* delimiters (the entities amp, lt, gt, apos, quot) is always
112	* treated as data. (The string "AT&amp;T;" expands to "AT&T;"
113	* and the remaining ampersand is not recognized as an
114	* entity-reference delimiter.) A @character_reference is
115	* included when the indicated character is processed in
116	* place of the reference itself.
117	*
118	* The following are forbidden, and constitute fatal errors:
119	*
120	* -- the appearance of a reference to an unparsed entity;
121	*
122	* -- the appearance of any character or general-entity reference
123	* in the @DTD except within an EntityValue or AttValue;
124	*
125	* -- a reference to an external entity in an attribute value.
126	*/
127
128	/*
129	*@@gloss: internal_entities internal entities
130	* An "internal entity" has no separate physical storage.
131	* Its contents appear in the document's @DTD as an
132	* @entity_declaration, like this:
133	*
134	+ <!ENTITY PrepDate "Feb 11, 2001">
135	*
136	* This can later be referenced with @entity_references
137	* and allows you to define shortcuts for frequently typed
138	* text or text that is expected to change, such as the
139	* revision status of a document.
140	*
141	* XML has five built-in internal entities:
142	*
143	* -- "&amp;" refers to the ampersand ("&") character,
144	* which normally introduces @markup and can therefore
145	* only be literally used in @comments, @processing_instructions,
146	* or @CDATA sections. This is also legal within the literal
147	* entity value of declarations of internal entities.
148	*
149	* -- "&lt;" and "&gt;" refer to the angle brackets
150	* ("<", ">") which normally introduce @elements.
151	* They must be escaped unless used in a @CDATA section.
152	*
153	* -- To allow values in @attributes to contain both single and double
154	* quotes, the apostrophe or single-quote character (') may be
155	* represented as "&apos;", and the double-quote character
156	* (") as "&quot;".
157	*
158	* A numeric @character_reference is a special case of an entity reference.
159	*
160	* An internal entity is always parsed.
161	*
162	* Also see @entities.
163	*/
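/*
 * A minimal sketch of the five built-in entities listed in the
 * glossary entry above, written as a lookup table. The function
 * name predefined_entity is hypothetical; it is not part of expat
 * or of the XWorkplace helpers. Entity names are case-sensitive,
 * so "AMP" is not recognized.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the character that a built-in entity name stands for,
   or '\0' if the name is not one of the five built-in entities. */
char predefined_entity(const char *name)
{
    static const struct {
        const char *name;
        char        ch;
    } table[] = {
        { "amp",  '&'  },
        { "lt",   '<'  },
        { "gt",   '>'  },
        { "apos", '\'' },
        { "quot", '"'  }
    };
    size_t i;

    assert(name != NULL);

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (strcmp(name, table[i].name) == 0)
            return table[i].ch;

    return '\0';    /* must be declared in the DTD instead */
}
```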
164
165	/*
166	*@@gloss: parameter_entities parameter entities
167	* Parameter entities can only be referenced in the @DTD.
168	* A parameter entity is identified by placing "% " (percent-space)
169	* in front of its name in the declaration. The percent sign is
170	* also used in references to parameter entities, instead of the
171	* ampersand. Parameter entity references are immediately expanded
172	* in the DTD and their replacement text is
173	* part of the declaration, whereas normal @entity_references are not
174	* expanded.
175	*/
176
177	/*
178	*@@gloss: external_entities external entities
179	* As opposed to @internal_entities, "external entities" refer
180	* to different storage.
181	*
182	* They must have a "system ID" with the URI specifying where
183	* the entity can be retrieved. Those URIs may be absolute
184	* or relative. Unless otherwise provided (e.g. by a special
185	* XML element type defined by a particular @DTD, or
186	* @processing_instructions defined by a particular application
187	* specification), relative URIs are relative to the location
188	* of the resource within which the entity declaration occurs.
189	*
190	* Optionally, external entities may specify a "public ID"
191	* as well. An XML processor attempting to retrieve the entity's
192	* content may use the public identifier to try to generate an
193	* alternative URI. If the processor is unable to do so, it must
194	* use the URI specified in the system literal. Before a match
195	* is attempted, all strings of @whitespace in the public
196	* identifier must be normalized to single space characters (#x20),
197	* and leading and trailing white space must be removed.
198	*
199	* An external entity is not always parsed.
200	*
201	* External entities allow an XML document to refer to an external
202	* file. External entities contain either text or binary data. If
203	* they contain text, the content of the external file is inserted
204	* at the point of reference and parsed as part of the referring
205	* document. Binary data is not parsed and may only be referenced
206	* in an attribute that has been declared as ENTITY or ENTITIES.
207	* Binary data is used to reference figures and
208	* other non-XML content in the document.
209	*
210	* Examples of external entity declarations:
211	+
212	+ <!ENTITY open-hatch
213	+ SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
214	+ <!ENTITY open-hatch
215	+ PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
216	+ "http://www.textuality.com/boilerplate/OpenHatch.xml">
217	+ <!ENTITY hatch-pic
218	+ SYSTEM "../grafix/OpenHatch.gif" NDATA gif >
219	*
220	* Character @encoding is processed on a per-external-entity basis.
221	* As a result, each external parsed entity in an XML document may
222	* use a different encoding for its characters.
223	*
224	* In the document entity, the encoding declaration is part of the XML
225	* @text_declaration.
226	*
227	* Also see @entities.
228	*/
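/*
 * The public ID normalization described in the glossary entry
 * above (collapse every run of whitespace to one space, strip
 * leading and trailing whitespace) might be sketched as follows.
 * Both function names are hypothetical, and the fixed 256-byte
 * buffers are an assumption made only to keep the example short.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Normalizes a public identifier in place. */
void normalize_public_id(char *s)
{
    char *src = s;
    char *dst = s;
    int pending_space = 0;

    assert(s != NULL);

    for (; *src != '\0'; src++) {
        if (*src == ' ' || *src == '\t' || *src == '\r' || *src == '\n') {
            /* remember the run, unless it is leading whitespace */
            if (dst != s)
                pending_space = 1;
        } else {
            /* emit one space for the whole preceding run */
            if (pending_space) {
                *dst++ = ' ';
                pending_space = 0;
            }
            *dst++ = *src;
        }
    }
    /* a pending run at the end is trailing whitespace: drop it */
    *dst = '\0';
}

/* Compares two public identifiers after normalizing copies of both.
   Returns 1 on a match, 0 otherwise (or if either ID is too long). */
int public_ids_match(const char *a, const char *b)
{
    char na[256], nb[256];

    assert(a != NULL && b != NULL);

    if (strlen(a) >= sizeof(na) || strlen(b) >= sizeof(nb))
        return 0;
    strcpy(na, a);
    strcpy(nb, b);
    normalize_public_id(na);
    normalize_public_id(nb);
    return strcmp(na, nb) == 0;
}
```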
229
230	/*
231	*@@gloss: external_parsed_entities external parsed entities
232	* An external parsed entity is an external entity whose content
233	* is parsed as XML; unparsed ones are declared with NDATA.
234	*
235	* See @external_entities.
236	*/
237
238	/*
239	*@@gloss: markup markup
240	* XML "markup" encodes a description of the @document's storage
241	* layout and logical structure.
242	*
243	* Markup is either @elements, @entity_references, @comments, @CDATA
244	* section delimiters, @DTD's, or @processing_instructions.
245	*
246	* XML "text" consists of markup and @content.
247	*/
248
249	/*
250	*@@gloss: whitespace whitespace
251	* In XML, "whitespace" consists of one or more space (0x20)
252	* characters, carriage returns, line feeds, or tabs.
253	*
254	* Whitespace handling in XML can vary. In @markup, this is
255	* used to separate the various @entities of course. However,
256	* in @content (i.e. non-markup), an application may
257	* or may not be interested in white space. Whitespace
258	* handling can therefore be specified separately for each
259	* element with the special "xml:space" attribute (see @attributes).
260	*/
261
262	/*
263	*@@gloss: character_reference character reference
264	* Character references escape Unicode characters. They are
265	* a special case of @entity_references.
266	*
267	* They may be used to refer to a specific character in the
268	* ISO/IEC 10646 character set, for example one not directly
269	* accessible from available input devices.
270	*
271	* If the character reference thus begins with "&#x", the
272	* digits and letters up to the terminating ";" provide a
273	* hexadecimal representation of the character's code point in
274	* ISO/IEC 10646. If it begins just with "&#", the
275	* digits up to the terminating ";" provide a decimal
276	* representation of the character's code point.
277	*/
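/*
 * As an illustration of the decimal and hexadecimal forms described
 * in the glossary entry above, here is a small sketch that decodes
 * a numeric character reference. The name decode_char_ref is
 * hypothetical (expat does this internally). The upper bound of
 * 0x10FFFF is an assumption based on the Unicode code space.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Decodes a reference such as "&#65;" or "&#x41;" found at the
   start of ref. Returns the code point, or -1 if the text is not
   a complete, syntactically valid character reference. */
long decode_char_ref(const char *ref)
{
    const char *p;
    long cp = 0;
    int base = 10;

    assert(ref != NULL);

    if (ref[0] != '&' || ref[1] != '#')
        return -1;
    p = ref + 2;
    if (*p == 'x') {            /* "&#x" introduces hexadecimal */
        base = 16;
        p++;
    }
    if (*p == ';')
        return -1;              /* no digits at all */

    for (; *p != ';'; p++) {
        int digit;

        if (*p >= '0' && *p <= '9')
            digit = *p - '0';
        else if (base == 16 && *p >= 'a' && *p <= 'f')
            digit = *p - 'a' + 10;
        else if (base == 16 && *p >= 'A' && *p <= 'F')
            digit = *p - 'A' + 10;
        else
            return -1;          /* bad digit or missing ';' */

        cp = cp * base + digit;
        if (cp > 0x10FFFF)
            return -1;          /* beyond the Unicode code space */
    }
    return cp;
}
```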
278
279	/*
280	*@@gloss: content content
281	* XML "text" consists of @markup and "content" (the XML spec
282	* calls this "character data"). Content is simply everything
283	* that is not markup.
284	*
285	* To access characters that would either otherwise be recognized
286	* as @markup or are difficult to reach via the keyboard, XML
287	* allows for using a @character_reference.
288	*
289	* Within @elements, content is any string of
290	* characters which does not contain the start-delimiter of
291	* any markup. In a @CDATA section, content is any
292	* string of characters not including the CDATA-section-close
293	* delimiter, "]]>".
294	*
295	* The character @encodings may vary between @external_parsed_entities.
296	*/
297
298	/*
299	*@@gloss: names names
300	* In XML, a "name" is a token beginning with a letter or one of a
301	* few punctuation characters, and continuing with letters,
302	* digits, hyphens, underscores, colons, or full stops,
303	* together known as name characters. The colon has a
304	* special meaning with XML namespaces.
305	*/
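/*
 * A rough sketch of the name rule from the glossary entry above,
 * restricted to ASCII. This is a deliberate simplification: the
 * XML specification also admits many non-ASCII letters, which this
 * check rejects. The name is_ascii_xml_name is hypothetical, and
 * the code assumes the default "C" locale for isalpha and isalnum.
 */

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s is a name made of ASCII name characters only,
   0 otherwise (including for the empty string). */
int is_ascii_xml_name(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);

    /* first character: a letter, underscore, or colon */
    if (!(isalpha(*p) || *p == '_' || *p == ':'))
        return 0;

    /* remaining characters: letters, digits, '-', '_', ':', '.' */
    for (p++; *p != '\0'; p++)
        if (!(isalnum(*p) || *p == '-' || *p == '_'
              || *p == ':' || *p == '.'))
            return 0;

    return 1;
}
```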
306
307	/*
308	*@@gloss: elements elements
309	* Elements are the most common form of XML @markup.
310	* They are identified by their @names.
311	*
312	* As opposed to HTML, there are two types of elements:
313	*
314	* A non-empty element starts and ends with a start-tag
315	* and an end-tag:
316	*
317	+ <LI>...</LI>
318	*
319	* As opposed to HTML, an empty element must have an
320	* empty-element tag:
321	*
322	+ <P /> <IMG align="left" src="http://www.w3.org/Icons/WWW/w3c_home" />
323	*
324	* In addition, @attributes supply extra parameters to elements.
325	* If the element has attributes, they must be in the start-tag
326	* (or empty-element tag).
327	*
328	* For non-empty elements, the text between the start-tag
329	* and end-tag is called the element's content and may
330	* contain other elements, character data, an entity
331	* reference, a @CDATA section, a processing instruction,
332	* or a comment.
333	*
334	* The XML specs break this into "content particles".
335	*
336	* An element has "mixed content" when it may contain
337	* @content, optionally interspersed with child
338	* elements. In this case, the types of the child
339	* elements may be constrained by a document's @DTD, but
340	* not their order or their number of occurrences.
341	*/
342
343	/*
344	*@@gloss: attributes attributes
345	* "Attributes" are name-value pairs that have been associated
346	* with @elements. Attributes can only appear in start-tags
347	* or empty-element tags.
348	*
349	* Attributes are identified by their @names. Each such
350	* identifier may only appear once per element.
351	*
352	* As opposed to HTML, attribute values must be quoted (either
353	* in single or double quotes). You may use a @character_reference
354	* to escape quotes in attribute values.
355	*
356	* Example of an attribute:
357	*
358	+ <IMG SRC="mypic.gif" />
359	*
360	* SRC="mypic.gif" is the attribute here.
361	*
362	* There are a few <B>special attributes</B> defined by XML.
363	* In @valid documents, these attributes, like any other,
364	* must be declared if they are used. These attributes are
365	* recursive, i.e. they are considered to apply to all elements
366	* within the content of the element where they are specified,
367	* unless overridden in a sub-element.
368	*
369	* -- "xml:space" may be attached to an element to signal
370	* that @whitespace should be preserved for this element.
371	*
372	* The value "default" signals that applications' default
373	* whitespace processing modes are acceptable for this
374	* element; the value "preserve" indicates the intent that
375	* applications preserve all the white space.
376	*
377	* -- "xml:lang" may be inserted in documents to specify the
378	* language used in the contents and attribute values of
379	* any element in an XML document.
380	*
381	* The value is either a two-letter language code (e.g. "en")
382	* or a combination of language and country code. Interestingly,
383	* the English W3C XML spec gives the following examples:
384	*
385	+ <p xml:lang="en">The quick brown fox jumps over the lazy dog.</p>
386	+ <p xml:lang="en-GB">What colour is it?</p>
387	+ <p xml:lang="en-US">What color is it?</p>
388	+ <sp who="Faust" desc='leise' xml:lang="de">
389	+ <l>Habe nun, ach! Philosophie,</l>
390	+ <l>Juristerei, und Medizin</l>
391	+ <l>und leider auch Theologie</l>
392	+ <l>durchaus studiert mit heißem Bemüh'n.</l>
393	+ </sp>
394	*/
395
396	/*
397	*@@gloss: comments comments
398	* Comments may appear anywhere in a document outside other
399	* markup; in addition, they may appear within the @DTD at
400	* places allowed by the grammar. They are not part of the
401	* document's @content; an XML processor may, but
402	* need not, make it possible for an application to retrieve
403	* the text of comments (@expat has a handler for this).
404	*
405	* Comments may contain any text except "--" (double-hyphen).
406	*
407	* Example of a comment:
408	*
409	+ <!-- declarations for <head> & <body> -->
410	*/
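/*
 * The double-hyphen restriction from the glossary entry above can
 * be checked with a few lines of C. The function name is
 * hypothetical. The extra test for a trailing hyphen is not stated
 * in the glossary; it follows from the XML 1.0 grammar, which does
 * not allow a comment to end in "--->".
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* text is what would go between "<!--" and "-->".
   Returns 1 if it may be used as comment text, 0 otherwise. */
int is_valid_comment_text(const char *text)
{
    size_t len;

    assert(text != NULL);

    if (strstr(text, "--") != NULL)
        return 0;               /* double hyphen inside the comment */

    len = strlen(text);
    if (len > 0 && text[len - 1] == '-')
        return 0;               /* would produce "--->" */

    return 1;
}
```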
411
412	/*
413	*@@gloss: CDATA CDATA
414	* CDATA sections can appear anywhere where @content
415	* is allowed. They are used to escape blocks of
416	* text containing characters which would otherwise be
417	* recognized as @markup.
418	*
419	* CDATA sections begin with the string <![CDATA[ and end
420	* with the string ]]>. Within a CDATA section, only the
421	* ]]> string is recognized as @markup, so that left angle
422	* brackets and ampersands may occur in their literal form.
423	* They need not (and cannot) be escaped using "&lt;" and
424	* "&amp;". (This implies that not even @comments are
425	* recognized).
426	*
427	* CDATA sections cannot nest.
428	*
429	* Examples:
430	*
431	+ <![CDATA[<greeting>Hello, world!</greeting>]]>
432	+
433	+ <![CDATA[
434	+ *p = &q;
435	+ b = (i <= 3);
436	+ ]]>
437	*/
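/*
 * Since only "]]>" is recognized inside a CDATA section, finding
 * the end of a section reduces to a plain substring search, as the
 * following sketch shows. The name cdata_content_length is
 * hypothetical. It also illustrates why CDATA sections cannot nest:
 * the first "]]>" always closes the section.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* after_open points just behind a "<![CDATA[" delimiter.
   Returns the number of content characters before the closing
   "]]>", or -1 if the section is never closed. */
long cdata_content_length(const char *after_open)
{
    const char *close;

    assert(after_open != NULL);

    close = strstr(after_open, "]]>");
    if (close == NULL)
        return -1;
    return (long)(close - after_open);
}
```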
438
439	/*
440	*@@gloss: processing_instructions processing instructions
441	* "Processing instructions" (PIs) contain additional
442	* data for applications.
443	*
444	* Like @comments, they are not textually part of the XML
445	* document, but the XML processor is required to pass
446	* them to an application.
447	*
448	* PIs have the form:
449	*
450	+ <?name pidata?>
451	*
452	*
453	* The "name", called the PI "target", identifies the PI to
454	* the application. Applications should process only the
455	* targets they recognize and ignore all other PIs. Any
456	* data that follows the PI target is optional; it is for
457	* the application that recognizes the target. The names
458	* used in PIs may be declared in a @notation_declaration in order to
459	* formally identify them.
460	*
461	* PI names beginning with "xml" are reserved.
462	*/
463
464	/*
465	*@@gloss: well-formed well-formed
466	* XML @documents (the sum of all @entities) are "well-formed"
467	* if the following conditions are met (among others):
468	*
469	* -- They contain one or more @elements.
470	*
471	* -- There is exactly one element, called the root, or document
472	* element, no part of which appears in the @content of any
473	* other element.
474	*
475	* -- For all other elements, if the start-tag is in the content
476	* of another element, the end-tag is in the content of the
477	* same element. More simply stated, the elements nest
478	* properly within each other. (This is unlike HTML.)
479	*
480	* -- Values of string @attributes cannot contain references to
481	* @external_entities.
482	*
483	* -- No attribute may appear more than once in the same element.
484	*
485	* -- All entities except amp, lt, gt, apos, and quot must be
486	* declared before they are used. Binary @external_entities
487	* cannot be referenced in the flow of @content; they can only
488	* be used in an attribute declared as ENTITY or ENTITIES.
489	*
490	* -- Neither text nor @parameter_entities are allowed to be
491	* recursive, directly or indirectly.
492	*/
493
494	/*
495	*@@gloss: valid valid
496	* XML @documents are said to be "valid" if they have a @DTD
497	* associated and they conform to it. While XML documents
498	* must always be @well-formed, validation and validity are up
499	* to the implementation (i.e. at the option of the application).
500	*
501	* Validating processors must report violations of the constraints
502	* expressed by the declarations in the @DTD, and failures to
503	* fulfill the validity constraints given in this specification.
504	* To accomplish this, validating XML processors must read and
505	* process the entire DTD and all @external_parsed_entities
506	* referenced in the document.
507	*
508	* Non-validating processors (such as @expat) are required to
509	* check only the document entity (see @entities), including the
510	* entire internal DTD subset, for whether it is @well-formed.
511	*
512	* While they are not required to check the document for validity,
513	* they are required to process all the declarations they
514	* read in the internal DTD subset and in any parameter entity
515	* that they read, up to the first reference to a parameter
516	* entity that they do not read; that is to say, they must
517	* use the information in those declarations to normalize
518	* values of @attributes, include the replacement text of
519	* @internal_entities, and supply default attribute values.
520	* They must not process entity declarations or attribute-list
521	* declarations encountered after a reference to a
522	* parameter entity that is not read, since the entity may have
523	* contained overriding declarations.
524	*/
525
526	/*
527	*@@gloss: encodings encodings
528	* XML supports a wide variety of character encodings. These
529	* must be specified in the XML @text_declaration.
530	*
531	* There are too many character encodings on the planet to
532	* be listed here. The most common ones are:
533	*
534	* -- "UTF-8", "UTF-16", "ISO-10646-UCS-2", and "ISO-10646-UCS-4"
535	* should be used for the various encodings and transformations
536	* of Unicode / ISO/IEC 10646.
537	*
538	* -- "ISO-8859-x" (with "x" being a number from 1 to 9) represent
539	* the various ISO 8859 ("Latin") encodings.
540	*
541	* -- "ISO-2022-JP", "Shift_JIS", and "EUC-JP" should be used for
542	* the various encoded forms of JIS X-0208-1997.
543	*
544	* Example of a @text_declaration:
545	*
546	+ <?xml version="1.0" encoding="ISO-8859-2"?>
547	*
548	* All XML processors must be able to read @entities in either
549	* UTF-8 or UTF-16. See XML_SetUnknownEncodingHandler for additional
550	* encodings directly supported by @expat.
551	*
552	* Entities encoded in UTF-16 must begin with the byte order mark (the
553	* ZERO WIDTH NO-BREAK SPACE character, #xFEFF). This is an encoding
554	* signature, not part of either the @markup or the @content of the
555	* XML @document. XML processors must be able to use this character
556	* to differentiate between UTF-8 and UTF-16 encoded documents.
557	*/
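/*
 * How a processor might use the #xFEFF signature described in the
 * glossary entry above can be sketched as a check of the first
 * bytes of an entity. The type bom_kind and the function detect_bom
 * are hypothetical names. This sketch does not look at the encoding
 * declaration and does not try to recognize four-byte encodings.
 */

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    BOM_NONE,       /* no signature: UTF-8 unless declared otherwise */
    BOM_UTF8,       /* EF BB BF, #xFEFF encoded in UTF-8 */
    BOM_UTF16_BE,   /* FE FF, big-endian UTF-16 */
    BOM_UTF16_LE    /* FF FE, little-endian UTF-16 */
} bom_kind;

/* Inspects at most the first three bytes of buf. */
bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;
    return BOM_NONE;
}
```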
558
559	/*
560	*@@gloss: text_declaration text declaration
561	* XML @documents and @external_parsed_entities may (and
562	* should) start with the XML text declaration, exactly like
563	* this:
564	*
565	+ <?xml version="1.0" encoding="enc"?>
566	*
567	* where "1.0" is the only currently defined XML version
568	* and "enc" must be the encoding of the document.
569	*
570	* External parsed entities may begin with a text declaration,
571	* which looks like an XML declaration with just an encoding
572	* declaration:
573	*
574	+ <?xml encoding="Big5"?>
575	*
576	* See @encodings.
577	*
578	* Example:
579	*
580	+ <?xml version="1.0" encoding="ISO-8859-1"?>
581	*/
582
583	/*
584	*@@gloss: documents documents
585	* XML documents are made up of storage units called @entities,
586	* which contain either parsed or unparsed data. Parsed data is
587	* made up of characters, some of which form @content,
588	* and some of which form @markup.
589	*
590	* XML documents should start with the XML @text_declaration.
591	*
592	* The function of the @markup in an XML document is to describe
593	* its storage and logical structure and to associate attribute-value
594	* pairs with its logical structures. XML provides a mechanism,
595	* the document type declaration (@DTD), to define constraints
596	* on the logical structure and to support the use of predefined
597	* storage units.
598	*
599	* A data object is an XML document if it is @well-formed.
600	* A well-formed XML document may in addition be @valid if it
601	* meets certain further constraints.
602	*
603	* A very simple XML document looks like this:
604	*
605	+ <?xml version="1.0"?>
606	+ <oldjoke>
607	+ <burns>Say <quote>goodnight</quote>, Gracie.</burns>
608	+ <allen><quote>Goodnight, Gracie.</quote></allen>
609	+ <applause/>
610	+ </oldjoke>
611	*
612	* This document is @well-formed, but not @valid (because it
613	* has no @DTD).
614	*
615	*/
616
617	/*
618	*@@gloss: element_declaration element declaration
619	* Element declarations identify the @names of elements and the
620	* nature of their content. They look like this:
621	+
622	+ <!ELEMENT name contentspec>
623	+
624	* No element may be declared more than once.
625	*
626	* The "name" of the element is obvious. The "contentspec"
627	* is not. This specifies what may appear in the element
628	* and can be one of the following:
629	*
630	* -- "EMPTY" marks the element as being empty (i.e.
631	* having no content at all).
632	*
633	* -- "ANY" does not impose any restrictions.
634	*
635	* -- (mixed): a "list" which declares the element to have
636	* mixed content. See below.
637	*
638	* -- (children): a "list" which declares the element to
639	* have child elements only, but no content. See below.
640	*
641	* <B>(mixed): content with elements</B>
642	*
643	* With the (mixed) contentspec, an element may either contain
644	* @content only or @content with subelements.
645	*
646	* While the (children) contentspec allows you to define sequences
647	* and orders, this is not possible with (mixed).
648	*
649	* "contentspec" must then be a pair of parentheses, optionally
650	* followed by "*". Inside the parentheses, there must be at least
651	* the keyword "#PCDATA", optionally followed by "|" and element
652	* names. Note that if no #PCDATA appears, the (children) model
653	* is assumed (see below).
654	*
655	* Examples:
656	*
657	+ <!ELEMENT name (#PCDATA)* >
658	+ <!ELEMENT name (#PCDATA | subname1 | subname2)* >
659	+ <!ELEMENT name (#PCDATA) >
660	*
661	* Note that if you specify sub-element names, you must terminate
662	* the contentspec with "*". Again, there's no way to specify
663	* orders etc. with (mixed).
664	*
665	* <B>(children): Element content only</B>
666	*
667	* With the (children) contentspec, an element may contain
668	* only other elements (and @whitespace), but no other @content.
669	*
670	* This can become fairly complicated. "contentspec" then must be
671	* a "list" followed by a "repeater".
672	*
673	* A "repeater" can be:
674	*
675	* -- Nothing: the preceding item _must_ appear exactly once.
676	*
677	* -- "+": the preceding item _must_ appear at _least_ once.
678	*
679	* -- "?": the preceding item _may_ appear exactly once.
680	*
681	* -- "*": the preceding item _may_ appear once or more than
682	* once or not at all.
683	*
684	* Here's the simplest example (presuming that "SUBELEMENT"
685	* is a valid "list" here):
686	*
687	+ <!ELEMENT name (SUBELEMENT)* >
688	*
689	* In other words, in (children) mode, "contentspec" must always
690	* be in parentheses and is followed by a "repeater" (which can be
691	* nothing).
692	*
693	* About "lists"... since these declarations may nest, this is
694	* where the recursive definition of a "content particle" comes
695	* in:
696	*
697	* -- A "content particle" is either a sub-element name or
698	* a nested list, followed by a "repeater".
699	*
700	* -- A "list" is defined as an enumeration of content particles,
701	* enclosed in parentheses, where the content particles are
702	* separated by "connectors".
703	*
704	* There are two types of "connectors":
705	*
706	* -- Commas (",") indicate that the elements must appear
707	* in the specified order ("sequence").
708	*
709	* -- Vertical bars ("|") specify that the elements may
710	* occur alternatively ("choice").
711	*
712	* The connectors cannot be mixed; the list must be
713	* either completely "sequence" or "choice".
714	*
715	* Examples of content particles:
716	*
717	+ SUBELEMENT+
718	+ list*
719	*
720	* Examples of lists:
721	*
722	+ ( cp | cp | cp | cp )
723	+ ( cp , cp , cp , cp )
724	*
725	* Full examples for (children):
726	*
727	+ <!ELEMENT oldjoke ( burns+, allen, applause? ) >
728	+                   | | +cp-+                | |
729	+                   | |                      | |
730	+                   | +------- list ---------+ |
731	+                   +-------contentspec--------+
732	*
733	* This specifies a "seqlist" for the "oldjoke" element. The
734	* list is not nested, so the content particles are element
735	* names only.
736	*
737	* Within "oldjoke", "burns" must appear first and can appear
738	* once or several times.
739	*
740	* Next must be "allen", exactly once (since there's no repeater).
741	*
742	* Optionally ("?"), there can be "applause" at the end.
743	*
744	* Now, a nested example:
745	*
746	+ <!ELEMENT poem (title?, (stanza+ | couplet+ | line+) ) >
747	*
748	* That is, a poem consists of an optional title, followed by one or
749	* several stanzas, or one or several couplets, or one or several lines.
750	* This is different from:
751	*
752	+ <!ELEMENT poem (title?, (stanza | couplet | line)+ ) >
753	*
754	* The latter allows for a single poem to contain a mixture of stanzas,
755	* couplets or lines.
756	*
757	* And for WarpIN:
758	*
759	+ <!ELEMENT WARPIN (REXX, VARPROMPT, MSG?, TITLE?, (GROUP | PCK)+, PAGE+) >
760	*
761	*/
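/*
 * To make the four "repeaters" from the glossary entry above more
 * concrete, the following sketch maps each one to the occurrence
 * counts it permits and applies that to the "oldjoke" sequence
 * ( burns+, allen, applause? ). Both function names are
 * hypothetical. Only the counts are checked here, not the order of
 * the child elements, which a real validator would also verify.
 */

```c
#include <assert.h>

/* repeater is '+', '?', '*', or '\0' for "no repeater".
   Returns 1 if count occurrences are permitted, 0 otherwise. */
int repeater_allows(char repeater, unsigned count)
{
    /* any other character is a bug in the caller */
    assert(repeater == '\0' || repeater == '+'
           || repeater == '?' || repeater == '*');

    switch (repeater) {
    case '+':
        return count >= 1;      /* at least once */
    case '?':
        return count <= 1;      /* optional, at most once */
    case '*':
        return 1;               /* any number, including none */
    default:
        return count == 1;      /* no repeater: exactly once */
    }
}

/* Checks child counts against ( burns+, allen, applause? ). */
int oldjoke_counts_ok(unsigned burns, unsigned allen, unsigned applause)
{
    return repeater_allows('+', burns)
        && repeater_allows('\0', allen)
        && repeater_allows('?', applause);
}
```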
762
763	/*
764	*@@gloss: attribute_declaration attribute declaration
765	* Attribute declarations identify the @names of attributes
766	* of @elements and their possible values. They look like this:
767	*
768	+ <!ATTLIST elementname
769	+ attname atttype defaultvalue
770	+ attname atttype defaultvalue
771	+ ... >
772	*
773	* "elementname" is the element name for which the
774	* attributes are being defined.
775	*
776	* For each attribute, you must then specify three
777	* columns:
778	*
779	* -- "attname" is the attribute name.
780	*
781	* -- "atttype" is the attribute type (one of six values,
782	* see below).
783	*
784	* -- "defaultvalue" specifies the default value.
785	*
786	* The attribute type (specifying the value type) must be
787	* one of six:
788	*
789	* -- "CDATA" is any character data. (This has nothing to
790	* do with @CDATA sections.)
791	*
792	* -- "ID": the value must be a name (see @names) that is unique
793	* within the document. Only one such attribute is allowed per
794	* element.
795	*
796	* -- "IDREF" or "IDREFS": a reference to some other
797	* element which has an "ID" attribute with this value.
798	* "IDREFS" is the plural and may contain several of
799	* those separated by @whitespace.
800	*
801	* -- "ENTITY" or "ENTITIES": a reference to an unparsed
802	* external entity (see @external_entities).
803	* "ENTITIES" is the plural and may contain several of
804	* those separated by @whitespace.
805	*
806	* -- "NMTOKEN" or "NMTOKENS": a single-word string.
807	* This is not a reference though.
808	* "NMTOKENS" is the plural and may contain several of
809	* those separated by @whitespace.
810	*
811	* -- an enumeration: an explicit list of allowed
812	* values for this attribute. Additionally, you can specify
813	* that the names must match a particular @notation_declaration.
814	*
815	* The "defaultvalue" (third column) can be one of these:
816	*
817	* -- "#REQUIRED": the attribute may not be omitted.
818	*
819	* -- "#IMPLIED": the attribute is optional, and there's
820	* no default value.
821	*
822	* -- "'value'": the attribute is optional, and it has
823	* this default.
824	*
825	* -- "#FIXED 'value'": the attribute is optional, but if
826	* it appears, it must have this value.
827	*
828	* Example:
829	*
830	+ <!ATTLIST oldjoke
831	+ name ID #REQUIRED
832	+ label CDATA #IMPLIED
833	+ status ( funny | notfunny ) 'funny'>
834	*/
835
836	/*
837	*@@gloss: entity_declaration entity declaration
838	* Entity declarations define @entities.
839	*
840	* An example of @internal_entities:
841	*
842	+ <!ENTITY ATI "ArborText, Inc.">
843	*
844	* Examples of @external_entities:
845	*
846	+ <!ENTITY boilerplate SYSTEM "/standard/legalnotice.xml">
847	+ <!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>
848	*/
849
850	/*
851	*@@gloss: notation_declaration notation declaration
852	* Notation declarations identify specific types of external
853	* binary data. This information is passed to the processing
854	* application, which may make whatever use of it it wishes.
855	*
856	* Example:
857	*
858	+ <!NOTATION GIF87A SYSTEM "GIF">
859	*/
860
861	/*
862	*@@gloss: DTD DTD
863	* The XML document type declaration contains or points to
864	* markup declarations that provide a grammar for a class of @documents.
865	* This grammar is known as a Document Type Definition, or DTD.
866	*
867	* The DTD must look like the following:
868	*
869	+ <!DOCTYPE name ... >
870	*
871	* "name" must match the document's root element.
872	*
873	* "..." can be the reference to an external subset (being a special
874	* case of @external_entities):
875	*
876	+ <!DOCTYPE name SYSTEM "whatever.dtd">
877	*
878	* or an internal subset in brackets, which contains the markup
879	* directly:
880	*
881	+ <!DOCTYPE name [
882	+ <!ELEMENT greeting (#PCDATA)>
883	+ ]>
884	*
885	* You can even mix both.
886	*
887	* A markup declaration is either an @element_declaration, an
888	* @attribute_declaration, an @entity_declaration,
889	* or a @notation_declaration. These declarations may be contained
890	* in whole or in part within @parameter_entities.
891	*/
892
893	/*
894	*@@gloss: DOM DOM
895	* DOM is the "Document Object Model", as defined by the W3C.
896	*
897	* The DOM is a programming interface for @XML @documents.
898	* (XML is a metalanguage and describes the documents
899	* themselves. DOM is a programming interface -- an API --
900	* to access XML documents.)
901	*
902	* The W3C calls this "a platform- and language-neutral
903	* interface that allows programs and scripts to dynamically
904	* access and update the content, structure and style of
905	* documents. The Document Object Model provides
906	* a standard set of objects for representing HTML and XML
907	* documents, a standard model of how these objects can
908	* be combined, and a standard interface for accessing and
909	* manipulating them. Vendors can support the DOM as an
910	* interface to their proprietary data structures and APIs,
911	* and content authors can write to the standard DOM
912	* interfaces rather than product-specific APIs, thus
913	* increasing interoperability on the Web."
914	*
915	* In short, DOM specifies that an XML document is broken
916	* up into a tree of "nodes", representing the various parts
917	* of an XML document. Such nodes represent @documents,
918	* @elements, @attributes, @processing_instructions,
919	* @comments, @content, and more.
920	*
921	* See xml.c for an introduction to XML and DOM support in
922	* the XWorkplace helpers.
923	*
924	* Example: Take this HTML table definition:
925	+
926	+ <TABLE>
927	+ <TBODY>
928	+ <TR>
929	+ <TD>Column 1-1</TD>
930	+ <TD>Column 1-2</TD>
931	+ </TR>
932	+ <TR>
933	+ <TD>Column 2-1</TD>
934	+ <TD>Column 2-2</TD>
935	+ </TR>
936	+ </TBODY>
937	+ </TABLE>
938	*
939	* In the DOM, this would be represented by a tree as follows:
940	+
941	+                  ┌────────────┐
942	+                  │   TABLE    │  (only ELEMENT node in root DOCUMENT node)
943	+                  └─────┬──────┘
944	+                        │
945	+                  ┌─────┴──────┐
946	+                  │   TBODY    │  (only ELEMENT node in root "TABLE" node)
947	+                  └─────┬──────┘
948	+            ┌───────────┴───────────┐
949	+      ┌─────┴──────┐          ┌─────┴──────┐
950	+      │     TR     │          │     TR     │
951	+      └─────┬──────┘          └─────┬──────┘
952	+        ┌───┴──────┐            ┌───┴──────┐
953	+    ┌───┴─┐     ┌──┴──┐     ┌───┴─┐     ┌──┴──┐
954	+    │ TD  │     │ TD  │     │ TD  │     │ TD  │
955	+    └──┬──┘     └──┬──┘     └───┬─┘     └──┬──┘
956	+ ╔═════╩════╗ ╔════╩═════╗ ╔════╩═════╗ ╔══╩═══════╗
957	+ ║Column 1-1║ ║Column 1-2║ ║Column 2-1║ ║Column 2-2║  (one TEXT node in each parent node)
958	+ ╚══════════╝ ╚══════════╝ ╚══════════╝ ╚══════════╝
959	*/
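
/*
 * The tree above can be sketched in C as follows. The Node structure
 * and the node_new, node_count, node_free and build_table_document
 * functions are hypothetical and exist only for this illustration;
 * they are not the node types or the xmlCreate* functions of the
 * XWorkplace helpers. As in the diagram, whitespace between the tags
 * is assumed not to produce TEXT nodes.
 */

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NODE_DOCUMENT, NODE_ELEMENT, NODE_TEXT } NodeType;

/* Hypothetical DOM-like node: children are kept as a singly linked list. */
typedef struct Node {
    NodeType     type;
    const char  *name;          /* element name, or the text of a TEXT node */
    struct Node *first_child;
    struct Node *last_child;
    struct Node *next_sibling;
} Node;

/* Creates a node and appends it to "parent" (if parent is not NULL). */
Node *node_new(NodeType type, const char *name, Node *parent)
{
    Node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->type = type;
    n->name = name;
    if (parent != NULL) {
        if (parent->last_child != NULL)
            parent->last_child->next_sibling = n;
        else
            parent->first_child = n;
        parent->last_child = n;
    }
    return n;
}

/* Counts the nodes of the given type in the subtree rooted at "n". */
int node_count(const Node *n, NodeType type)
{
    int count = (n->type == type);
    for (const Node *c = n->first_child; c != NULL; c = c->next_sibling)
        count += node_count(c, type);
    return count;
}

/* Frees the subtree rooted at "n". */
void node_free(Node *n)
{
    Node *c = n->first_child;
    while (c != NULL) {
        Node *next = c->next_sibling;
        node_free(c);
        c = next;
    }
    free(n);
}

/* Builds the DOCUMENT -> TABLE -> TBODY -> 2 x TR -> 2 x TD -> TEXT tree. */
Node *build_table_document(void)
{
    static const char *cells[2][2] = {
        { "Column 1-1", "Column 1-2" },
        { "Column 2-1", "Column 2-2" }
    };
    Node *doc   = node_new(NODE_DOCUMENT, "#document", NULL);
    Node *table = node_new(NODE_ELEMENT, "TABLE", doc);
    Node *tbody = node_new(NODE_ELEMENT, "TBODY", table);

    for (int r = 0; r < 2; r++) {
        Node *tr = node_new(NODE_ELEMENT, "TR", tbody);
        for (int c = 0; c < 2; c++) {
            Node *td = node_new(NODE_ELEMENT, "TD", tr);
            node_new(NODE_TEXT, cells[r][c], td);
        }
    }
    return doc;
}
```

/*
 * Calling build_table_document() and then node_count() on the result
 * should report eight ELEMENT nodes and four TEXT nodes under the
 * single DOCUMENT node; node_free() releases the whole tree.
 */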
960
961	/*
962	*@@gloss: DOM_DOCUMENT DOCUMENT
963	* Representation of XML @documents in the @DOM.
964	*
965	* The xwphelpers implementation differs from the DOM specs
966	* as follows:
967	*
968	* -- The "doctype" member points to the document's @DTD, or is NULL.
969	* In our implementation, this is the pvExtra pointer, which points
970	* to a _DOMDTD.
971	*
972	* -- The "implementation" member points to a DOMImplementation object.
973	* This is not supported here.
974	*
975	* -- The "documentElement" member is a convenience pointer to the
976	* document's root element. We don't supply this field; instead,
977	* the llChildren list only contains a single ELEMENT node for the
978	* root element.
979	*
980	* -- The "createElement" method is implemented by xmlCreateElementNode.
981	*
982	* -- The "createAttribute" method is implemented by xmlCreateAttributeNode.
983	*
984	* -- The "createTextNode" method is implemented by xmlCreateTextNode,
985	* which has an extra parameter though.
986	*
987	* -- The "createComment" method is implemented by xmlCreateCommentNode.
988	*
989	* -- The "createProcessingInstruction" method is implemented by
990	* xmlCreatePINode.
991	*
992	* -- The "createDocumentFragment", "createCDATASection", and
993	* "createEntityReference" methods are not supported.
994	*/
995
996
