
:mod:`xml.sax` --- Support for SAX2 parsers
===========================================

.. module:: xml.sax
   :synopsis: Package containing SAX2 base classes and convenience functions.
.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>


.. versionadded:: 2.0

The :mod:`xml.sax` package provides a number of modules which implement the
Simple API for XML (SAX) interface for Python. The package itself provides the
SAX exceptions and the convenience functions which will be most used by users of
the SAX API.


.. warning::

   The :mod:`xml.sax` module is not secure against maliciously
   constructed data. If you need to parse untrusted or unauthenticated data see
   :ref:`xml-vulnerabilities`.


The convenience functions are:


.. function:: make_parser([parser_list])

   Create and return a SAX :class:`~xml.sax.xmlreader.XMLReader` object. The
   first parser found will be used. If *parser_list* is provided, it must be a
   sequence of strings which name modules that have a function named
   :func:`create_parser`. Modules listed in *parser_list* will be used before
   modules in the default list of parsers.


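As a minimal sketch of how the reader returned by :func:`make_parser` might be used, the example below attaches a handler and parses a small in-memory document. The ``ElementCounter`` class and the sample document are hypothetical, introduced only for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that counts every start tag it sees."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


# No parser_list is given, so the default list of parsers is searched.
reader = xml.sax.make_parser()
counter = ElementCounter()
reader.setContentHandler(counter)

# The reader accepts a file-like object as its input.
reader.parse(io.BytesIO(b"<root><a/><b/></root>"))
print(counter.count)
```

Passing a *parser_list* is only needed when a specific parser module should be tried before the defaults.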
.. function:: parse(filename_or_stream, handler[, error_handler])

   Create a SAX parser and use it to parse a document. The document, passed in as
   *filename_or_stream*, can be a filename or a file object. The *handler*
   parameter needs to be a SAX :class:`~handler.ContentHandler` instance. If
   *error_handler* is given, it must be a SAX :class:`~handler.ErrorHandler`
   instance; if omitted, :exc:`SAXParseException` will be raised on all errors.
   There is no return value; all work must be done by the *handler* passed in.


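Because :func:`parse` returns nothing, results have to be accumulated on the handler. The following sketch illustrates that pattern with a file object; the ``TitleCollector`` class and the sample document are assumptions made for this example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class TitleCollector(ContentHandler):
    """Hypothetical handler that gathers the text of each <title> element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


document = io.BytesIO(b"<books><title>SAX</title><title>DOM</title></books>")
collector = TitleCollector()
result = xml.sax.parse(document, collector)  # result is None
print(collector.titles)
```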
.. function:: parseString(string, handler[, error_handler])

   Similar to :func:`parse`, but parses from a buffer *string* received as a
   parameter.

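A short sketch of :func:`parseString` applied to an in-memory buffer follows. The ``AttributeRecorder`` handler and the sample markup are hypothetical; a byte string is used here so that the same code should work across Python versions.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class AttributeRecorder(ContentHandler):
    """Hypothetical handler that records each element with its attributes."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.seen = []

    def startElement(self, name, attrs):
        # attrs follows the Attributes interface; copy it into a plain dict.
        self.seen.append((name, dict(attrs.items())))


recorder = AttributeRecorder()
xml.sax.parseString(b'<point x="1" y="2"/>', recorder)
print(recorder.seen)
```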
A typical SAX application uses three kinds of objects: readers, handlers and
input sources. "Reader" in this context is another term for parser, i.e. some
piece of code that reads the bytes or characters from the input source, and
produces a sequence of events. The events then get distributed to the handler
objects, i.e. the reader invokes a method on the handler. A SAX application
must therefore obtain a reader object, create or open the input sources, create
the handlers, and connect these objects all together. As the final step of
preparation, the reader is called to parse the input. During parsing, methods on
the handler objects are called based on structural and syntactic events from the
input data.

For these objects, only the interfaces are relevant; they are normally not
instantiated by the application itself. Since Python does not have an explicit
notion of interface, they are formally introduced as classes, but applications
may use implementations which do not inherit from the provided classes. The
:class:`~xml.sax.xmlreader.InputSource`, :class:`~xml.sax.xmlreader.Locator`,
:class:`~xml.sax.xmlreader.Attributes`, :class:`~xml.sax.xmlreader.AttributesNS`,
and :class:`~xml.sax.xmlreader.XMLReader` interfaces are defined in the
module :mod:`xml.sax.xmlreader`. The handler interfaces are defined in
:mod:`xml.sax.handler`. For convenience,
:class:`~xml.sax.xmlreader.InputSource` (which is often
instantiated directly) and the handler classes are also available from
:mod:`xml.sax`. These interfaces are described below.

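The steps described above (obtain a reader, create an input source, create the handlers, connect them, then parse) can be sketched as follows. The ``EventLogger`` class and the sample document are hypothetical; the sketch uses the names re-exported from :mod:`xml.sax` for convenience.

```python
import io
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records the events the reader delivers."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append("start:" + name)

    def endElement(self, name):
        self.events.append("end:" + name)

    def endDocument(self):
        self.events.append("endDocument")


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the input source and give it a byte stream to read from.
source = xml.sax.InputSource()
source.setByteStream(io.BytesIO(b"<doc><item/></doc>"))

# 3. Create the handlers and connect them to the reader.
logger = EventLogger()
reader.setContentHandler(logger)
reader.setErrorHandler(xml.sax.ErrorHandler())

# 4. Ask the reader to parse; it invokes the handler methods as it goes.
reader.parse(source)
print(logger.events)
```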
In addition to these classes, :mod:`xml.sax` provides the following exception
classes.


.. exception:: SAXException(msg[, exception])

   Encapsulate an XML error or warning. This class can contain basic error or
   warning information from either the XML parser or the application: it can be
   subclassed to provide additional functionality or to add localization. Note
   that although the handlers defined in the
   :class:`~xml.sax.handler.ErrorHandler` interface
   receive instances of this exception, it is not required to actually raise the
   exception --- it is also useful as a container for information.

   When instantiated, *msg* should be a human-readable description of the error.
   The optional *exception* parameter, if given, should be ``None`` or an exception
   that was caught by the parsing code and is being passed along as information.

   This is the base class for the other SAX exception classes.


.. exception:: SAXParseException(msg, exception, locator)

   Subclass of :exc:`SAXException` raised on parse errors. Instances of this
   class are passed to the methods of the SAX
   :class:`~xml.sax.handler.ErrorHandler` interface to provide information
   about the parse error. This class supports the SAX
   :class:`~xml.sax.xmlreader.Locator` interface as well as the
   :class:`SAXException` interface.


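As an illustration of the combined :class:`SAXException` and Locator interfaces, the sketch below parses a deliberately malformed document and inspects the exception that is raised. The sample input is hypothetical, and the exact message text and column value depend on the underlying parser, so they are printed rather than relied upon.

```python
import xml.sax
from xml.sax.handler import ContentHandler

caught = None
try:
    # <unclosed> is never closed, so the document is not well-formed.
    xml.sax.parseString(b"<root><unclosed></root>", ContentHandler())
except xml.sax.SAXParseException as err:
    caught = err

if caught is not None:
    # Locator-style accessors report where the problem was detected.
    line = caught.getLineNumber()
    column = caught.getColumnNumber()
    # SAXException-style accessor gives the human-readable description.
    print("line %s, column %s: %s" % (line, column, caught.getMessage()))
```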
.. exception:: SAXNotRecognizedException(msg[, exception])

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is
   confronted with an unrecognized feature or property. SAX applications and
   extensions may use this class for similar purposes.


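A minimal sketch of how this exception might surface: the reader is asked about a feature name it has never heard of. The feature URI below is made up for the example, and the sketch assumes the default reader rejects unknown names in ``getFeature()``, as the bundled expat-based reader does.

```python
import xml.sax

reader = xml.sax.make_parser()

# Hypothetical feature name that no reader is expected to know.
unknown_feature = "http://example.org/sax/features/not-a-real-feature"

try:
    reader.getFeature(unknown_feature)
    recognized = True
except xml.sax.SAXNotRecognizedException as err:
    recognized = False
    print("not recognized:", err.getMessage())
```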
.. exception:: SAXNotSupportedException(msg[, exception])

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is asked to
   enable a feature that is not supported, or to set a property to a value that the
   implementation does not support. SAX applications and extensions may use this
   class for similar purposes.


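For comparison, the sketch below asks the reader to enable a feature that it knows about but may not implement. It assumes the default reader is the bundled expat-based one, which does not support validation; a different parser could accept the request, which is why the outcome is stored in a flag rather than taken for granted.

```python
import xml.sax
from xml.sax.handler import feature_validation

reader = xml.sax.make_parser()

try:
    # Validation is a standard SAX2 feature name, but support is optional.
    reader.setFeature(feature_validation, True)
    validating = True
except xml.sax.SAXNotSupportedException as err:
    validating = False
    print("not supported:", err.getMessage())
```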
.. seealso::

   `SAX: The Simple API for XML <http://www.saxproject.org/>`_
      This site is the focal point for the definition of the SAX API. It provides a
      Java implementation and online documentation. Links to implementations and
      historical information are also available.

   Module :mod:`xml.sax.handler`
      Definitions of the interfaces for application-provided objects.

   Module :mod:`xml.sax.saxutils`
      Convenience functions for use in SAX applications.

   Module :mod:`xml.sax.xmlreader`
      Definitions of the interfaces for parser-provided objects.


.. _sax-exception-objects:

SAXException Objects
--------------------

The :class:`SAXException` exception class supports the following methods:


.. method:: SAXException.getMessage()

   Return a human-readable message describing the error condition.


.. method:: SAXException.getException()

   Return an encapsulated exception object, or ``None``.

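The two accessors can be illustrated with an application-raised :class:`SAXException` that wraps another exception. The ``load_setting`` helper and its message text are hypothetical, introduced only to show the round trip.

```python
import xml.sax


def load_setting(raw):
    """Hypothetical helper: convert raw to an int, wrapping any failure."""
    try:
        return int(raw)
    except ValueError as original:
        # Pass the caught exception along as information.
        raise xml.sax.SAXException("setting is not an integer", original)


try:
    load_setting("abc")
except xml.sax.SAXException as err:
    message = err.getMessage()      # the human-readable description
    wrapped = err.getException()    # the encapsulated exception, or None
    print(message, "/", type(wrapped).__name__)
```

When no exception is supplied to the constructor, ``getException()`` returns ``None``.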