\section{\module{xml.sax} ---
         Support for SAX2 parsers}

\declaremodule{standard}{xml.sax}
\modulesynopsis{Package containing SAX2 base classes and convenience
                functions.}
\moduleauthor{Lars Marius Garshol}{larsga@garshol.priv.no}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}
\sectionauthor{Martin v. L\"owis}{martin@v.loewis.de}

\versionadded{2.0}


The \module{xml.sax} package provides a number of modules which
implement the Simple API for XML (SAX) interface for Python.  The
package itself provides the SAX exceptions and the convenience
functions that users of the SAX API will use most often.

The convenience functions are:

\begin{funcdesc}{make_parser}{\optional{parser_list}}
  Create and return a SAX \class{XMLReader} object.  The first parser
  found will be used.  If \var{parser_list} is provided, it must be a
  sequence of strings which name modules that have a function named
  \function{create_parser()}.  Modules listed in \var{parser_list}
  will be used before modules in the default list of parsers.
\end{funcdesc}

\begin{funcdesc}{parse}{filename_or_stream, handler\optional{, error_handler}}
  Create a SAX parser and use it to parse a document.  The document,
  passed in as \var{filename_or_stream}, can be a filename or a file
  object.  The \var{handler} parameter needs to be a SAX
  \class{ContentHandler} instance.  If \var{error_handler} is given,
  it must be a SAX \class{ErrorHandler} instance; if omitted, 
  \exception{SAXParseException} will be raised on all errors.  There
  is no return value; all work must be done by the \var{handler}
  passed in.
\end{funcdesc}

\begin{funcdesc}{parseString}{string, handler\optional{, error_handler}}
  Similar to \function{parse()}, but parses from a buffer \var{string}
  received as a parameter.
\end{funcdesc}
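
As a minimal sketch of how the convenience functions are used, the
following passes an in-memory document to \function{parseString()}.  The
\class{ElementCounter} handler and the sample document are hypothetical
names introduced only for this illustration; a byte string is used
because it is accepted as a buffer by current Python versions.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that counts start tags by element name."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the reader once for every opening tag.
        self.counts[name] = self.counts.get(name, 0) + 1


# Illustrative document, not taken from the module documentation.
DOCUMENT = b"<catalog><item/><item/><note>hi</note></catalog>"

counter = ElementCounter()
# parseString() has no return value; results accumulate on the handler.
result = xml.sax.parseString(DOCUMENT, counter)
```

Since nothing is returned, the application reads its results from the
handler afterwards, here from \code{counter.counts}.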

A typical SAX application uses three kinds of objects: readers,
handlers and input sources.  ``Reader'' in this context is another
term for parser, i.e.\ some piece of code that reads the bytes or
characters from the input source, and produces a sequence of events.
The events then get distributed to the handler objects, i.e.\ the
reader invokes a method on the handler.  A SAX application must
therefore obtain a reader object, create or open the input sources,
create the handlers, and connect these objects together.  As the
final step of preparation, the reader is called to parse the input.
During parsing, methods on the handler objects are called based on
structural and syntactic events from the input data.
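
The steps described above can be sketched as follows: obtain a reader,
create an input source, create a handler, connect them, and parse.  The
\class{EventRecorder} class and the sample bytes are assumptions made
for this illustration, and the sketch relies on a default parser being
available to \function{make_parser()}.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class EventRecorder(ContentHandler):
    """Hypothetical handler that records the events it receives."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


# 1. Obtain a reader (parser) object.
reader = xml.sax.make_parser()

# 2. Create the input source; here it wraps an in-memory byte stream.
source = InputSource()
source.setByteStream(io.BytesIO(b"<doc><para/></doc>"))

# 3. Create the handler and connect it to the reader.
handler = EventRecorder()
reader.setContentHandler(handler)

# 4. Ask the reader to parse; it invokes the handler methods as it goes.
reader.parse(source)
```

After parsing, \code{handler.events} holds the start and end events in
document order.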

For these objects, only the interfaces are relevant; they are normally
not instantiated by the application itself.  Since Python does not have
an explicit notion of interface, they are formally introduced as
classes, but applications may use implementations which do not inherit
from the provided classes.  The \class{InputSource}, \class{Locator},
\class{Attributes}, \class{AttributesNS}, and
\class{XMLReader} interfaces are defined in the module
\refmodule{xml.sax.xmlreader}.  The handler interfaces are defined in
\refmodule{xml.sax.handler}.  For convenience, \class{InputSource}
(which is often instantiated directly) and the handler classes are
also available from \module{xml.sax}.  These interfaces are described
below.

In addition to these classes, \module{xml.sax} provides the following
exception classes.

\begin{excclassdesc}{SAXException}{msg\optional{, exception}}
  Encapsulate an XML error or warning.  This class can contain basic
  error or warning information from either the XML parser or the
  application: it can be subclassed to provide additional
  functionality or to add localization.  Note that although the
  handlers defined in the \class{ErrorHandler} interface receive
  instances of this exception, it is not required to actually raise
  the exception; it is also useful as a container for information.

  When instantiated, \var{msg} should be a human-readable description
  of the error.  The optional \var{exception} parameter, if given,
  should be \code{None} or an exception that was caught by the parsing
  code and is being passed along as information.

  This is the base class for the other SAX exception classes.
\end{excclassdesc}

\begin{excclassdesc}{SAXParseException}{msg, exception, locator}
  Subclass of \exception{SAXException} raised on parse errors.
  Instances of this class are passed to the methods of the SAX
  \class{ErrorHandler} interface to provide information about the
  parse error.  This class supports the SAX \class{Locator} interface
  as well as the \class{SAXException} interface.
\end{excclassdesc}
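
A minimal sketch of catching a parse error follows.  It assumes that no
error handler is passed, so the error is raised as an exception, and it
uses a deliberately malformed sample document.  The position is read
through the \class{Locator} methods \method{getLineNumber()} and
\method{getColumnNumber()}; the exact message text depends on the
underlying parser.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Illustrative malformed input: <child> is never closed.
MALFORMED = b"<root>\n  <child>\n</root>"

caught = False
message = None
line = None
column = None

try:
    # No error handler is passed, so parse errors are raised.
    xml.sax.parseString(MALFORMED, ContentHandler())
except xml.sax.SAXParseException as err:
    caught = True
    message = err.getMessage()      # SAXException interface
    line = err.getLineNumber()      # Locator interface
    column = err.getColumnNumber()  # Locator interface
```

An application would typically combine \code{message}, \code{line} and
\code{column} into a diagnostic for the user.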

\begin{excclassdesc}{SAXNotRecognizedException}{msg\optional{, exception}}
  Subclass of \exception{SAXException} raised when a SAX
  \class{XMLReader} is confronted with an unrecognized feature or
  property.  SAX applications and extensions may use this class for
  similar purposes.
\end{excclassdesc}

\begin{excclassdesc}{SAXNotSupportedException}{msg\optional{, exception}}
  Subclass of \exception{SAXException} raised when a SAX
  \class{XMLReader} is asked to enable a feature that is not
  supported, or to set a property to a value that the implementation
  does not support.  SAX applications and extensions may use this
  class for similar purposes.
\end{excclassdesc}
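
The difference between the two exceptions can be sketched by probing
features on a reader.  The \function{try_feature()} helper and the
feature URI below are hypothetical and exist only for this
illustration.  Which outcome a recognized feature produces depends on
the parser that \function{make_parser()} returns, so the validation
result is recorded rather than assumed.

```python
import xml.sax
from xml.sax.handler import feature_validation


def try_feature(reader, name, state):
    """Hypothetical helper: report how the reader reacts to setFeature()."""
    try:
        reader.setFeature(name, state)
    except xml.sax.SAXNotRecognizedException:
        # The reader does not know this feature name at all.
        return "not recognized"
    except xml.sax.SAXNotSupportedException:
        # The reader knows the name but cannot honour the request.
        return "not supported"
    return "ok"


reader = xml.sax.make_parser()

# Made-up feature name that no reader is expected to know.
UNKNOWN_FEATURE = "http://example.org/sax/features/hypothetical"
unknown_outcome = try_feature(reader, UNKNOWN_FEATURE, True)

# A standard feature name; non-validating readers may refuse to enable it.
validation_outcome = try_feature(reader, feature_validation, True)
```

Catching the two subclasses separately lets an application distinguish
a misspelled feature name from a capability the reader lacks.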


\begin{seealso}
  \seetitle[http://www.saxproject.org/]{SAX: The Simple API for
            XML}{This site is the focal point for the definition of
            the SAX API.  It provides a Java implementation and online
            documentation.  Links to implementations and historical
            information are also available.}

  \seemodule{xml.sax.handler}{Definitions of the interfaces for
             application-provided objects.}

  \seemodule{xml.sax.saxutils}{Convenience functions for use in SAX
             applications.}

  \seemodule{xml.sax.xmlreader}{Definitions of the interfaces for
             parser-provided objects.}
\end{seealso}


\subsection{SAXException Objects \label{sax-exception-objects}}

The \class{SAXException} exception class supports the following
methods:

\begin{methoddesc}[SAXException]{getMessage}{}
  Return a human-readable message describing the error condition.
\end{methoddesc}

\begin{methoddesc}[SAXException]{getException}{}
  Return an encapsulated exception object, or \code{None}.
\end{methoddesc}
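
As a brief sketch of these two methods, the following wraps an
application-level error in a \class{SAXException} and reads it back.
The \function{load_setting()} function and its message text are
hypothetical and serve only as an illustration.

```python
from xml.sax import SAXException


def load_setting(text):
    """Hypothetical helper that reports bad input as a SAXException."""
    try:
        return int(text)
    except ValueError as original:
        # Pass the caught exception along as information.
        raise SAXException("setting is not an integer", original)


try:
    load_setting("abc")
except SAXException as err:
    message = err.getMessage()    # the human-readable description
    wrapped = err.getException()  # the encapsulated ValueError

# Without an encapsulated exception, getException() returns None.
plain = SAXException("plain message").getException()
```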