Changeset 391 for python/trunk/Doc/library/xml.etree.elementtree.rst
- Timestamp:
- Mar 19, 2014, 11:31:01 PM
- Location:
- python/trunk
- Files:
-
- 2 edited
python/trunk
-
Property svn:mergeinfo
set to
/python/vendor/Python-2.7.6 merged eligible /python/vendor/current merged eligible
-
Property svn:mergeinfo
set to
-
python/trunk/Doc/library/xml.etree.elementtree.rst
r2 r391 1 2 1 :mod:`xml.etree.ElementTree` --- The ElementTree XML API 3 2 ======================================================== … … 10 9 .. versionadded:: 2.5 11 10 12 The Element type is a flexible container object, designed to store hierarchical 13 data structures in memory. The type can be described as a cross between a list 14 and a dictionary. 11 **Source code:** :source:`Lib/xml/etree/ElementTree.py` 12 13 -------------- 14 15 The :class:`Element` type is a flexible container object, designed to store 16 hierarchical data structures in memory. The type can be described as a cross 17 between a list and a dictionary. 18 19 20 .. warning:: 21 22 The :mod:`xml.etree.ElementTree` module is not secure against 23 maliciously constructed data. If you need to parse untrusted or 24 unauthenticated data see :ref:`xml-vulnerabilities`. 25 15 26 16 27 Each element has a number of properties associated with it: … … 27 38 * a number of child elements, stored in a Python sequence 28 39 29 To create an element instance, use the Element or SubElement factory functions. 40 To create an element instance, use the :class:`Element` constructor or the 41 :func:`SubElement` factory function. 30 42 31 43 The :class:`ElementTree` class can be used to wrap an element structure, and … … 35 47 36 48 See http://effbot.org/zone/element-index.htm for tutorials and links to other 37 docs. Fredrik Lundh's page is also the location of the development version of the 38 xml.etree.ElementTree. 49 docs. Fredrik Lundh's page is also the location of the development version of 50 the xml.etree.ElementTree. 51 52 .. versionchanged:: 2.7 53 The ElementTree API is updated to 1.3. For more information, see 54 `Introducing ElementTree 1.3 55 <http://effbot.org/zone/elementtree-13-intro.htm>`_. 56 57 Tutorial 58 -------- 59 60 This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in 61 short). The goal is to demonstrate some of the building blocks and basic 62 concepts of the module. 
63 64 XML tree and elements 65 ^^^^^^^^^^^^^^^^^^^^^ 66 67 XML is an inherently hierarchical data format, and the most natural way to 68 represent it is with a tree. ``ET`` has two classes for this purpose - 69 :class:`ElementTree` represents the whole XML document as a tree, and 70 :class:`Element` represents a single node in this tree. Interactions with 71 the whole document (reading and writing to/from files) are usually done 72 on the :class:`ElementTree` level. Interactions with a single XML element 73 and its sub-elements are done on the :class:`Element` level. 74 75 .. _elementtree-parsing-xml: 76 77 Parsing XML 78 ^^^^^^^^^^^ 79 80 We'll be using the following XML document as the sample data for this section: 81 82 .. code-block:: xml 83 84 <?xml version="1.0"?> 85 <data> 86 <country name="Liechtenstein"> 87 <rank>1</rank> 88 <year>2008</year> 89 <gdppc>141100</gdppc> 90 <neighbor name="Austria" direction="E"/> 91 <neighbor name="Switzerland" direction="W"/> 92 </country> 93 <country name="Singapore"> 94 <rank>4</rank> 95 <year>2011</year> 96 <gdppc>59900</gdppc> 97 <neighbor name="Malaysia" direction="N"/> 98 </country> 99 <country name="Panama"> 100 <rank>68</rank> 101 <year>2011</year> 102 <gdppc>13600</gdppc> 103 <neighbor name="Costa Rica" direction="W"/> 104 <neighbor name="Colombia" direction="E"/> 105 </country> 106 </data> 107 108 We have a number of ways to import the data. Reading the file from disk:: 109 110 import xml.etree.ElementTree as ET 111 tree = ET.parse('country_data.xml') 112 root = tree.getroot() 113 114 Reading the data from a string:: 115 116 root = ET.fromstring(country_data_as_string) 117 118 :func:`fromstring` parses XML from a string directly into an :class:`Element`, 119 which is the root element of the parsed tree. Other parsing functions may 120 create an :class:`ElementTree`. Check the documentation to be sure. 
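The difference in return types described above can be checked directly. This is only a minimal sketch: the trimmed-down ``xml_bytes`` sample and the use of ``io.BytesIO`` as a stand-in for ``country_data.xml`` on disk are assumptions made for illustration and are not part of the changeset.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: a trimmed-down version of the country document.
xml_bytes = b"<data><country name='Panama'><rank>68</rank></country></data>"

# fromstring() hands back the root Element directly.
root_from_string = ET.fromstring(xml_bytes)

# parse() takes a filename or file object and returns an ElementTree wrapper;
# getroot() is needed to reach the root Element.
tree = ET.parse(io.BytesIO(xml_bytes))
root_from_file = tree.getroot()

# Both routes end at an element whose tag is 'data'.
same_tag = root_from_string.tag == root_from_file.tag
```

When unsure which kind of object a parsing function returned, ``ET.iselement(obj)`` is a quick check before calling element methods on it.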
121 122 As an :class:`Element`, ``root`` has a tag and a dictionary of attributes:: 123 124 >>> root.tag 125 'data' 126 >>> root.attrib 127 {} 128 129 It also has children nodes over which we can iterate:: 130 131 >>> for child in root: 132 ... print child.tag, child.attrib 133 ... 134 country {'name': 'Liechtenstein'} 135 country {'name': 'Singapore'} 136 country {'name': 'Panama'} 137 138 Children are nested, and we can access specific child nodes by index:: 139 140 >>> root[0][1].text 141 '2008' 142 143 Finding interesting elements 144 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 145 146 :class:`Element` has some useful methods that help iterate recursively over all 147 the sub-tree below it (its children, their children, and so on). For example, 148 :meth:`Element.iter`:: 149 150 >>> for neighbor in root.iter('neighbor'): 151 ... print neighbor.attrib 152 ... 153 {'name': 'Austria', 'direction': 'E'} 154 {'name': 'Switzerland', 'direction': 'W'} 155 {'name': 'Malaysia', 'direction': 'N'} 156 {'name': 'Costa Rica', 'direction': 'W'} 157 {'name': 'Colombia', 'direction': 'E'} 158 159 :meth:`Element.findall` finds only elements with a tag which are direct 160 children of the current element. :meth:`Element.find` finds the *first* child 161 with a particular tag, and :attr:`Element.text` accesses the element's text 162 content. :meth:`Element.get` accesses the element's attributes:: 163 164 >>> for country in root.findall('country'): 165 ... rank = country.find('rank').text 166 ... name = country.get('name') 167 ... print name, rank 168 ... 169 Liechtenstein 1 170 Singapore 4 171 Panama 68 172 173 More sophisticated specification of which elements to look for is possible by 174 using :ref:`XPath <elementtree-xpath>`. 175 176 Modifying an XML File 177 ^^^^^^^^^^^^^^^^^^^^^ 178 179 :class:`ElementTree` provides a simple way to build XML documents and write them to files. 180 The :meth:`ElementTree.write` method serves this purpose. 
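As a rough sketch of how the lookup methods and :meth:`ElementTree.write` fit together, the snippet below reads values with ``findall``, ``find`` and ``get`` and then serializes the tree. The two-country sample and the in-memory ``buffer`` (used in place of a real output file) are assumptions chosen to keep the example self-contained.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical two-country sample, not the full tutorial document.
root = ET.fromstring(
    b"<data>"
    b"<country name='Singapore'><rank>4</rank></country>"
    b"<country name='Panama'><rank>68</rank></country>"
    b"</data>"
)
tree = ET.ElementTree(root)

# findall() only looks at direct children; find() returns the first match,
# and get() reads an attribute value.
ranks = {}
for country in root.findall("country"):
    ranks[country.get("name")] = int(country.find("rank").text)

# write() accepts a filename or a file object opened for writing; an
# in-memory bytes buffer stands in for a file on disk here.
buffer = io.BytesIO()
tree.write(buffer)
serialized = buffer.getvalue()
```

Passing a filename such as ``'output.xml'`` instead of the buffer writes the same bytes to disk.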
181 182 Once created, an :class:`Element` object may be manipulated by directly changing 183 its fields (such as :attr:`Element.text`), adding and modifying attributes 184 (:meth:`Element.set` method), as well as adding new children (for example 185 with :meth:`Element.append`). 186 187 Let's say we want to add one to each country's rank, and add an ``updated`` 188 attribute to the rank element:: 189 190 >>> for rank in root.iter('rank'): 191 ... new_rank = int(rank.text) + 1 192 ... rank.text = str(new_rank) 193 ... rank.set('updated', 'yes') 194 ... 195 >>> tree.write('output.xml') 196 197 Our XML now looks like this: 198 199 .. code-block:: xml 200 201 <?xml version="1.0"?> 202 <data> 203 <country name="Liechtenstein"> 204 <rank updated="yes">2</rank> 205 <year>2008</year> 206 <gdppc>141100</gdppc> 207 <neighbor name="Austria" direction="E"/> 208 <neighbor name="Switzerland" direction="W"/> 209 </country> 210 <country name="Singapore"> 211 <rank updated="yes">5</rank> 212 <year>2011</year> 213 <gdppc>59900</gdppc> 214 <neighbor name="Malaysia" direction="N"/> 215 </country> 216 <country name="Panama"> 217 <rank updated="yes">69</rank> 218 <year>2011</year> 219 <gdppc>13600</gdppc> 220 <neighbor name="Costa Rica" direction="W"/> 221 <neighbor name="Colombia" direction="E"/> 222 </country> 223 </data> 224 225 We can remove elements using :meth:`Element.remove`. Let's say we want to 226 remove all countries with a rank higher than 50:: 227 228 >>> for country in root.findall('country'): 229 ... rank = int(country.find('rank').text) 230 ... if rank > 50: 231 ... root.remove(country) 232 ... 233 >>> tree.write('output.xml') 234 235 Our XML now looks like this: 236 237 .. 
code-block:: xml 238 239 <?xml version="1.0"?> 240 <data> 241 <country name="Liechtenstein"> 242 <rank updated="yes">2</rank> 243 <year>2008</year> 244 <gdppc>141100</gdppc> 245 <neighbor name="Austria" direction="E"/> 246 <neighbor name="Switzerland" direction="W"/> 247 </country> 248 <country name="Singapore"> 249 <rank updated="yes">5</rank> 250 <year>2011</year> 251 <gdppc>59900</gdppc> 252 <neighbor name="Malaysia" direction="N"/> 253 </country> 254 </data> 255 256 Building XML documents 257 ^^^^^^^^^^^^^^^^^^^^^^ 258 259 The :func:`SubElement` function also provides a convenient way to create new 260 sub-elements for a given element:: 261 262 >>> a = ET.Element('a') 263 >>> b = ET.SubElement(a, 'b') 264 >>> c = ET.SubElement(a, 'c') 265 >>> d = ET.SubElement(c, 'd') 266 >>> ET.dump(a) 267 <a><b /><c><d /></c></a> 268 269 Additional resources 270 ^^^^^^^^^^^^^^^^^^^^ 271 272 See http://effbot.org/zone/element-index.htm for tutorials and links to other 273 docs. 274 275 .. _elementtree-xpath: 276 277 XPath support 278 ------------- 279 280 This module provides limited support for 281 `XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a 282 tree. The goal is to support a small subset of the abbreviated syntax; a full 283 XPath engine is outside the scope of the module. 284 285 Example 286 ^^^^^^^ 287 288 Here's an example that demonstrates some of the XPath capabilities of the 289 module. 
We'll be using the ``countrydata`` XML document from the 290 :ref:`Parsing XML <elementtree-parsing-xml>` section:: 291 292 import xml.etree.ElementTree as ET 293 294 root = ET.fromstring(countrydata) 295 296 # Top-level elements 297 root.findall(".") 298 299 # All 'neighbor' grand-children of 'country' children of the top-level 300 # elements 301 root.findall("./country/neighbor") 302 303 # Nodes with name='Singapore' that have a 'year' child 304 root.findall(".//year/..[@name='Singapore']") 305 306 # 'year' nodes that are children of nodes with name='Singapore' 307 root.findall(".//*[@name='Singapore']/year") 308 309 # All 'neighbor' nodes that are the second child of their parent 310 root.findall(".//neighbor[2]") 311 312 Supported XPath syntax 313 ^^^^^^^^^^^^^^^^^^^^^^ 314 315 .. tabularcolumns:: |l|L| 316 317 +-----------------------+------------------------------------------------------+ 318 | Syntax | Meaning | 319 +=======================+======================================================+ 320 | ``tag`` | Selects all child elements with the given tag. | 321 | | For example, ``spam`` selects all child elements | 322 | | named ``spam``, ``spam/egg`` selects all | 323 | | grandchildren named ``egg`` in all children named | 324 | | ``spam``. | 325 +-----------------------+------------------------------------------------------+ 326 | ``*`` | Selects all child elements. For example, ``*/egg`` | 327 | | selects all grandchildren named ``egg``. | 328 +-----------------------+------------------------------------------------------+ 329 | ``.`` | Selects the current node. This is mostly useful | 330 | | at the beginning of the path, to indicate that it's | 331 | | a relative path. | 332 +-----------------------+------------------------------------------------------+ 333 | ``//`` | Selects all subelements, on all levels beneath the | 334 | | current element. For example, ``.//egg`` selects | 335 | | all ``egg`` elements in the entire tree. 
| 336 +-----------------------+------------------------------------------------------+ 337 | ``..`` | Selects the parent element. | 338 +-----------------------+------------------------------------------------------+ 339 | ``[@attrib]`` | Selects all elements that have the given attribute. | 340 +-----------------------+------------------------------------------------------+ 341 | ``[@attrib='value']`` | Selects all elements for which the given attribute | 342 | | has the given value. The value cannot contain | 343 | | quotes. | 344 +-----------------------+------------------------------------------------------+ 345 | ``[tag]`` | Selects all elements that have a child named | 346 | | ``tag``. Only immediate children are supported. | 347 +-----------------------+------------------------------------------------------+ 348 | ``[position]`` | Selects all elements that are located at the given | 349 | | position. The position can be either an integer | 350 | | (1 is the first position), the expression ``last()`` | 351 | | (for the last position), or a position relative to | 352 | | the last position (e.g. ``last()-1``). | 353 +-----------------------+------------------------------------------------------+ 354 355 Predicates (expressions within square brackets) must be preceded by a tag 356 name, an asterisk, or another predicate. ``position`` predicates must be 357 preceded by a tag name. 358 359 Reference 360 --------- 39 361 40 362 .. _elementtree-functions: 41 363 42 364 Functions 43 --------- 44 45 46 .. function:: Comment([text]) 47 48 Comment element factory. This factory function creates a special element that 49 will be serialized as an XML comment. The comment string can be either an 8-bit 50 ASCII string or a Unicode string. *text* is a string containing the comment 51 string. Returns an element instance representing a comment. 365 ^^^^^^^^^ 366 367 368 .. function:: Comment(text=None) 369 370 Comment element factory. 
This factory function creates a special element 371 that will be serialized as an XML comment by the standard serializer. The 372 comment string can be either a bytestring or a Unicode string. *text* is a 373 string containing the comment string. Returns an element instance 374 representing a comment. 52 375 53 376 54 377 .. function:: dump(elem) 55 378 56 Writes an element tree or element structure to sys.stdout. This function should57 be used for debugging only.379 Writes an element tree or element structure to sys.stdout. This function 380 should be used for debugging only. 58 381 59 382 The exact output format is implementation dependent. In this version, it's … … 63 386 64 387 65 .. function:: Element(tag[, attrib][, **extra])66 67 Element factory. This function returns an object implementing the standard68 Element interface. The exact class or type of that object is implementation69 dependent, but it will always be compatible with the _ElementInterface class in70 this module.71 72 The element name, attribute names, and attribute values can be either 8-bit73 ASCII strings or Unicode strings. *tag* is the element name. *attrib* is an74 optional dictionary, containing element attributes. *extra* contains additional75 attributes, given as keyword arguments. Returns an element instance.76 77 78 388 .. function:: fromstring(text) 79 389 80 Parses an XML section from a string constant. Same as XML. *text* is a string 81 containing XML data. Returns an Element instance. 390 Parses an XML section from a string constant. Same as :func:`XML`. *text* 391 is a string containing XML data. Returns an :class:`Element` instance. 392 393 394 .. function:: fromstringlist(sequence, parser=None) 395 396 Parses an XML document from a sequence of string fragments. *sequence* is a 397 list or other sequence containing XML data fragments. *parser* is an 398 optional parser instance. If not given, the standard :class:`XMLParser` 399 parser is used. 
Returns an :class:`Element` instance. 400 401 .. versionadded:: 2.7 82 402 83 403 84 404 .. function:: iselement(element) 85 405 86 Checks if an object appears to be a valid element object. *element* is an87 element instance. Returns a true value if this is an element object.88 89 90 .. function:: iterparse(source [, events])406 Checks if an object appears to be a valid element object. *element* is an 407 element instance. Returns a true value if this is an element object. 408 409 410 .. function:: iterparse(source, events=None, parser=None) 91 411 92 412 Parses an XML section into an element tree incrementally, and reports what's 93 going on to the user. *source* is a filename or file object containing XML data. 94 *events* is a list of events to report back. If omitted, only "end" events are 95 reported. Returns an :term:`iterator` providing ``(event, elem)`` pairs. 413 going on to the user. *source* is a filename or file object containing XML 414 data. *events* is a list of events to report back. If omitted, only "end" 415 events are reported. *parser* is an optional parser instance. If not 416 given, the standard :class:`XMLParser` parser is used. *parser* is not 417 supported by ``cElementTree``. Returns an :term:`iterator` providing 418 ``(event, elem)`` pairs. 96 419 97 420 .. note:: … … 106 429 107 430 108 .. function:: parse(source[, parser]) 109 110 Parses an XML section into an element tree. *source* is a filename or file 111 object containing XML data. *parser* is an optional parser instance. If not 112 given, the standard XMLTreeBuilder parser is used. Returns an ElementTree 113 instance. 114 115 116 .. function:: ProcessingInstruction(target[, text]) 117 118 PI element factory. This factory function creates a special element that will 119 be serialized as an XML processing instruction. *target* is a string containing 120 the PI target. *text* is a string containing the PI contents, if given. 
Returns 121 an element instance, representing a processing instruction. 122 123 124 .. function:: SubElement(parent, tag[, attrib[, **extra]]) 125 126 Subelement factory. This function creates an element instance, and appends it 127 to an existing element. 128 129 The element name, attribute names, and attribute values can be either 8-bit 130 ASCII strings or Unicode strings. *parent* is the parent element. *tag* is the 131 subelement name. *attrib* is an optional dictionary, containing element 132 attributes. *extra* contains additional attributes, given as keyword arguments. 133 Returns an element instance. 134 135 136 .. function:: tostring(element[, encoding]) 137 138 Generates a string representation of an XML element, including all subelements. 139 *element* is an Element instance. *encoding* is the output encoding (default is 140 US-ASCII). Returns an encoded string containing the XML data. 141 142 143 .. function:: XML(text) 431 .. function:: parse(source, parser=None) 432 433 Parses an XML section into an element tree. *source* is a filename or file 434 object containing XML data. *parser* is an optional parser instance. If 435 not given, the standard :class:`XMLParser` parser is used. Returns an 436 :class:`ElementTree` instance. 437 438 439 .. function:: ProcessingInstruction(target, text=None) 440 441 PI element factory. This factory function creates a special element that 442 will be serialized as an XML processing instruction. *target* is a string 443 containing the PI target. *text* is a string containing the PI contents, if 444 given. Returns an element instance, representing a processing instruction. 445 446 447 .. function:: register_namespace(prefix, uri) 448 449 Registers a namespace prefix. The registry is global, and any existing 450 mapping for either the given prefix or the namespace URI will be removed. 451 *prefix* is a namespace prefix. *uri* is a namespace uri. 
Tags and 452 attributes in this namespace will be serialized with the given prefix, if at 453 all possible. 454 455 .. versionadded:: 2.7 456 457 458 .. function:: SubElement(parent, tag, attrib={}, **extra) 459 460 Subelement factory. This function creates an element instance, and appends 461 it to an existing element. 462 463 The element name, attribute names, and attribute values can be either 464 bytestrings or Unicode strings. *parent* is the parent element. *tag* is 465 the subelement name. *attrib* is an optional dictionary, containing element 466 attributes. *extra* contains additional attributes, given as keyword 467 arguments. Returns an element instance. 468 469 470 .. function:: tostring(element, encoding="us-ascii", method="xml") 471 472 Generates a string representation of an XML element, including all 473 subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is 474 the output encoding (default is US-ASCII). *method* is either ``"xml"``, 475 ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an encoded string 476 containing the XML data. 477 478 479 .. function:: tostringlist(element, encoding="us-ascii", method="xml") 480 481 Generates a string representation of an XML element, including all 482 subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is 483 the output encoding (default is US-ASCII). *method* is either ``"xml"``, 484 ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of encoded 485 strings containing the XML data. It does not guarantee any specific 486 sequence, except that ``"".join(tostringlist(element)) == 487 tostring(element)``. 488 489 .. versionadded:: 2.7 490 491 492 .. function:: XML(text, parser=None) 144 493 145 494 Parses an XML section from a string constant. This function can be used to 146 embed "XML literals" in Python code. *text* is a string containing XML data. 147 Returns an Element instance. 148 149 150 .. 
function:: XMLID(text) 495 embed "XML literals" in Python code. *text* is a string containing XML 496 data. *parser* is an optional parser instance. If not given, the standard 497 :class:`XMLParser` parser is used. Returns an :class:`Element` instance. 498 499 500 .. function:: XMLID(text, parser=None) 151 501 152 502 Parses an XML section from a string constant, and also returns a dictionary 153 which maps from element id:s to elements. *text* is a string containing XML 154 data. Returns a tuple containing an Element instance and a dictionary. 155 156 157 .. _elementtree-element-interface: 158 159 The Element Interface 160 --------------------- 161 162 Element objects returned by Element or SubElement have the following methods 163 and attributes. 164 165 166 .. attribute:: Element.tag 167 168 A string identifying what kind of data this element represents (the element 169 type, in other words). 170 171 172 .. attribute:: Element.text 173 174 The *text* attribute can be used to hold additional data associated with the 175 element. As the name implies this attribute is usually a string but may be any 176 application-specific object. If the element is created from an XML file the 177 attribute will contain any text found between the element tags. 178 179 180 .. attribute:: Element.tail 181 182 The *tail* attribute can be used to hold additional data associated with the 183 element. This attribute is usually a string but may be any application-specific 184 object. If the element is created from an XML file the attribute will contain 185 any text found after the element's end tag and before the next tag. 186 187 188 .. attribute:: Element.attrib 189 190 A dictionary containing the element's attributes. Note that while the *attrib* 191 value is always a real mutable Python dictionary, an ElementTree implementation 192 may choose to use another internal representation, and create the dictionary 193 only if someone asks for it. 
To take advantage of such implementations, use the 194 dictionary methods below whenever possible. 195 196 The following dictionary-like methods work on the element attributes. 197 198 199 .. method:: Element.clear() 200 201 Resets an element. This function removes all subelements, clears all 202 attributes, and sets the text and tail attributes to None. 203 204 205 .. method:: Element.get(key[, default=None]) 206 207 Gets the element attribute named *key*. 208 209 Returns the attribute value, or *default* if the attribute was not found. 210 211 212 .. method:: Element.items() 213 214 Returns the element attributes as a sequence of (name, value) pairs. The 215 attributes are returned in an arbitrary order. 216 217 218 .. method:: Element.keys() 219 220 Returns the elements attribute names as a list. The names are returned in an 221 arbitrary order. 222 223 224 .. method:: Element.set(key, value) 225 226 Set the attribute *key* on the element to *value*. 227 228 The following methods work on the element's children (subelements). 229 230 231 .. method:: Element.append(subelement) 232 233 Adds the element *subelement* to the end of this elements internal list of 234 subelements. 235 236 237 .. method:: Element.find(match) 238 239 Finds the first subelement matching *match*. *match* may be a tag name or path. 240 Returns an element instance or ``None``. 241 242 243 .. method:: Element.findall(match) 244 245 Finds all subelements matching *match*. *match* may be a tag name or path. 246 Returns an iterable yielding all matching elements in document order. 247 248 249 .. method:: Element.findtext(condition[, default=None]) 250 251 Finds text for the first subelement matching *condition*. *condition* may be a 252 tag name or path. Returns the text content of the first matching element, or 253 *default* if no element was found. Note that if the matching element has no 254 text content an empty string is returned. 255 256 257 .. 
method:: Element.getchildren() 258 259 Returns all subelements. The elements are returned in document order. 260 261 262 .. method:: Element.getiterator([tag=None]) 263 264 Creates a tree iterator with the current element as the root. The iterator 265 iterates over this element and all elements below it, in document (depth first) 266 order. If *tag* is not ``None`` or ``'*'``, only elements whose tag equals 267 *tag* are returned from the iterator. 268 269 270 .. method:: Element.insert(index, element) 271 272 Inserts a subelement at the given position in this element. 273 274 275 .. method:: Element.makeelement(tag, attrib) 276 277 Creates a new element object of the same type as this element. Do not call this 278 method, use the SubElement factory function instead. 279 280 281 .. method:: Element.remove(subelement) 282 283 Removes *subelement* from the element. Unlike the findXYZ methods this method 284 compares elements based on the instance identity, not on tag value or contents. 285 286 Element objects also support the following sequence type methods for working 287 with subelements: :meth:`__delitem__`, :meth:`__getitem__`, :meth:`__setitem__`, 288 :meth:`__len__`. 289 290 Caution: Because Element objects do not define a :meth:`__nonzero__` method, 291 elements with no subelements will test as ``False``. :: 292 293 element = root.find('foo') 294 295 if not element: # careful! 296 print "element not found, or element has no subelements" 297 298 if element is None: 299 print "element not found" 503 which maps from element id:s to elements. *text* is a string containing XML 504 data. *parser* is an optional parser instance. If not given, the standard 505 :class:`XMLParser` parser is used. Returns a tuple containing an 506 :class:`Element` instance and a dictionary. 507 508 509 .. _elementtree-element-objects: 510 511 Element Objects 512 ^^^^^^^^^^^^^^^ 513 514 .. class:: Element(tag, attrib={}, **extra) 515 516 Element class. 
This class defines the Element interface, and provides a 517 reference implementation of this interface. 518 519 The element name, attribute names, and attribute values can be either 520 bytestrings or Unicode strings. *tag* is the element name. *attrib* is 521 an optional dictionary, containing element attributes. *extra* contains 522 additional attributes, given as keyword arguments. 523 524 525 .. attribute:: tag 526 527 A string identifying what kind of data this element represents (the 528 element type, in other words). 529 530 531 .. attribute:: text 532 533 The *text* attribute can be used to hold additional data associated with 534 the element. As the name implies this attribute is usually a string but 535 may be any application-specific object. If the element is created from 536 an XML file the attribute will contain any text found between the element 537 tags. 538 539 540 .. attribute:: tail 541 542 The *tail* attribute can be used to hold additional data associated with 543 the element. This attribute is usually a string but may be any 544 application-specific object. If the element is created from an XML file 545 the attribute will contain any text found after the element's end tag and 546 before the next tag. 547 548 549 .. attribute:: attrib 550 551 A dictionary containing the element's attributes. Note that while the 552 *attrib* value is always a real mutable Python dictionary, an ElementTree 553 implementation may choose to use another internal representation, and 554 create the dictionary only if someone asks for it. To take advantage of 555 such implementations, use the dictionary methods below whenever possible. 556 557 The following dictionary-like methods work on the element attributes. 558 559 560 .. method:: clear() 561 562 Resets an element. This function removes all subelements, clears all 563 attributes, and sets the text and tail attributes to None. 564 565 566 .. 
method:: get(key, default=None) 567 568 Gets the element attribute named *key*. 569 570 Returns the attribute value, or *default* if the attribute was not found. 571 572 573 .. method:: items() 574 575 Returns the element attributes as a sequence of (name, value) pairs. The 576 attributes are returned in an arbitrary order. 577 578 579 .. method:: keys() 580 581 Returns the elements attribute names as a list. The names are returned 582 in an arbitrary order. 583 584 585 .. method:: set(key, value) 586 587 Set the attribute *key* on the element to *value*. 588 589 The following methods work on the element's children (subelements). 590 591 592 .. method:: append(subelement) 593 594 Adds the element *subelement* to the end of this elements internal list 595 of subelements. 596 597 598 .. method:: extend(subelements) 599 600 Appends *subelements* from a sequence object with zero or more elements. 601 Raises :exc:`AssertionError` if a subelement is not a valid object. 602 603 .. versionadded:: 2.7 604 605 606 .. method:: find(match) 607 608 Finds the first subelement matching *match*. *match* may be a tag name 609 or path. Returns an element instance or ``None``. 610 611 612 .. method:: findall(match) 613 614 Finds all matching subelements, by tag name or path. Returns a list 615 containing all matching elements in document order. 616 617 618 .. method:: findtext(match, default=None) 619 620 Finds text for the first subelement matching *match*. *match* may be 621 a tag name or path. Returns the text content of the first matching 622 element, or *default* if no element was found. Note that if the matching 623 element has no text content an empty string is returned. 624 625 626 .. method:: getchildren() 627 628 .. deprecated:: 2.7 629 Use ``list(elem)`` or iteration. 630 631 632 .. method:: getiterator(tag=None) 633 634 .. deprecated:: 2.7 635 Use method :meth:`Element.iter` instead. 636 637 638 .. 
method:: insert(index, element) 639 640 Inserts a subelement at the given position in this element. 641 642 643 .. method:: iter(tag=None) 644 645 Creates a tree :term:`iterator` with the current element as the root. 646 The iterator iterates over this element and all elements below it, in 647 document (depth first) order. If *tag* is not ``None`` or ``'*'``, only 648 elements whose tag equals *tag* are returned from the iterator. If the 649 tree structure is modified during iteration, the result is undefined. 650 651 .. versionadded:: 2.7 652 653 654 .. method:: iterfind(match) 655 656 Finds all matching subelements, by tag name or path. Returns an iterable 657 yielding all matching elements in document order. 658 659 .. versionadded:: 2.7 660 661 662 .. method:: itertext() 663 664 Creates a text iterator. The iterator loops over this element and all 665 subelements, in document order, and returns all inner text. 666 667 .. versionadded:: 2.7 668 669 670 .. method:: makeelement(tag, attrib) 671 672 Creates a new element object of the same type as this element. Do not 673 call this method, use the :func:`SubElement` factory function instead. 674 675 676 .. method:: remove(subelement) 677 678 Removes *subelement* from the element. Unlike the find\* methods this 679 method compares elements based on the instance identity, not on tag value 680 or contents. 681 682 :class:`Element` objects also support the following sequence type methods 683 for working with subelements: :meth:`~object.__delitem__`, 684 :meth:`~object.__getitem__`, :meth:`~object.__setitem__`, 685 :meth:`~object.__len__`. 686 687 Caution: Elements with no subelements will test as ``False``. This behavior 688 will change in future versions. Use specific ``len(elem)`` or ``elem is 689 None`` test instead. :: 690 691 element = root.find('foo') 692 693 if not element: # careful! 
694 print "element not found, or element has no subelements" 695 696 if element is None: 697 print "element not found" 300 698 301 699 … … 303 701 304 702 ElementTree Objects 305 ------------------- 306 307 308 .. class:: ElementTree([element,] [file]) 309 310 ElementTree wrapper class. This class represents an entire element hierarchy, 311 and adds some extra support for serialization to and from standard XML. 312 313 *element* is the root element. The tree is initialized with the contents of the 314 XML *file* if given. 703 ^^^^^^^^^^^^^^^^^^^ 704 705 706 .. class:: ElementTree(element=None, file=None) 707 708 ElementTree wrapper class. This class represents an entire element 709 hierarchy, and adds some extra support for serialization to and from 710 standard XML. 711 712 *element* is the root element. The tree is initialized with the contents 713 of the XML *file* if given. 315 714 316 715 … … 319 718 Replaces the root element for this tree. This discards the current 320 719 contents of the tree, and replaces it with the given element. Use with 321 care. *element* is an element instance. 322 323 324 .. method:: find(path) 325 326 Finds the first toplevel element with given tag. Same as 327 getroot().find(path). *path* is the element to look for. Returns the 328 first matching element, or ``None`` if no element was found. 329 330 331 .. method:: findall(path) 332 333 Finds all toplevel elements with the given tag. Same as 334 getroot().findall(path). *path* is the element to look for. Returns a 335 list or :term:`iterator` containing all matching elements, in document 336 order. 337 338 339 .. method:: findtext(path[, default]) 340 341 Finds the element text for the first toplevel element with given tag. 342 Same as getroot().findtext(path). *path* is the toplevel element to look 343 for. *default* is the value to return if the element was not 344 found. Returns the text content of the first matching element, or the 345 default value no element was found. 
Note that if the element has is 346 found, but has no text content, this method returns an empty string. 347 348 349 .. method:: getiterator([tag]) 720 care. *element* is an element instance. 721 722 723 .. method:: find(match) 724 725 Same as :meth:`Element.find`, starting at the root of the tree. 726 727 728 .. method:: findall(match) 729 730 Same as :meth:`Element.findall`, starting at the root of the tree. 731 732 733 .. method:: findtext(match, default=None) 734 735 Same as :meth:`Element.findtext`, starting at the root of the tree. 736 737 738 .. method:: getiterator(tag=None) 739 740 .. deprecated:: 2.7 741 Use method :meth:`ElementTree.iter` instead. 742 743 744 .. method:: getroot() 745 746 Returns the root element for this tree. 747 748 749 .. method:: iter(tag=None) 350 750 351 751 Creates and returns a tree iterator for the root element. The iterator 352 loops over all elements in this tree, in section order. *tag* is the tag752 loops over all elements in this tree, in section order. *tag* is the tag 353 753 to look for (default is to return all elements) 354 754 355 755 356 .. method:: getroot() 357 358 Returns the root element for this tree. 359 360 361 .. method:: parse(source[, parser]) 362 363 Loads an external XML section into this element tree. *source* is a file 364 name or file object. *parser* is an optional parser instance. If not 365 given, the standard XMLTreeBuilder parser is used. Returns the section 756 .. method:: iterfind(match) 757 758 Finds all matching subelements, by tag name or path. Same as 759 getroot().iterfind(match). Returns an iterable yielding all matching 760 elements in document order. 761 762 .. versionadded:: 2.7 763 764 765 .. method:: parse(source, parser=None) 766 767 Loads an external XML section into this element tree. *source* is a file 768 name or file object. *parser* is an optional parser instance. If not 769 given, the standard XMLParser parser is used. Returns the section 366 770 root element. 
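The :class:`ElementTree` methods described above can be sketched together as follows. This is a minimal illustration, not part of the changeset: the sample document and all variable names are assumptions made for the example, and the source is read from an in-memory file object rather than a file name.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for this illustration.
SAMPLE = b"""<html>
  <head><title>Example page</title></head>
  <body>
    <p>Moved to <a href="http://example.org/">example.org</a>
    or <a href="http://example.com/">example.com</a>.</p>
  </body>
</html>"""

tree = ET.ElementTree()
# parse() accepts a file name or a file object and returns the root element.
root = tree.parse(io.BytesIO(SAMPLE))

# The tree-level lookups start at the root element.
first_p = tree.find("body/p")                         # first match, or None
titles = tree.findall("head/title")                   # list of matches
title_text = tree.findtext("head/title")              # text of the first match
missing = tree.findtext("head/meta", default="n/a")   # default when not found

# iter() walks the whole tree in document order; iterfind() yields path matches.
links = list(tree.iter("a"))
hrefs = [a.get("href") for a in tree.iterfind("body/p/a")]
```

Passing a file object keeps the sketch self-contained. With a real file, the file name can be passed to ``parse()`` directly.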
367 771 368 772 369 .. method:: write(file[, encoding]) 370 371 Writes the element tree to a file, as XML. *file* is a file name, or a 372 file object opened for writing. *encoding* [1]_ is the output encoding 373 (default is US-ASCII). 773 .. method:: write(file, encoding="us-ascii", xml_declaration=None, \ 774 default_namespace=None, method="xml") 775 776 Writes the element tree to a file, as XML. *file* is a file name, or a 777 file object opened for writing. *encoding* [1]_ is the output encoding 778 (default is US-ASCII). *xml_declaration* controls if an XML declaration 779 should be added to the file. Use False for never, True for always, None 780 for only if not US-ASCII or UTF-8 (default is None). *default_namespace* 781 sets the default XML namespace (for "xmlns"). *method* is either 782 ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an 783 encoded string. 374 784 375 785 This is the XML file that is going to be manipulated:: … … 390 800 >>> tree = ElementTree() 391 801 >>> tree.parse("index.xhtml") 392 <Element html at b7d3f1ec>802 <Element 'html' at 0xb77e6fac> 393 803 >>> p = tree.find("body/p") # Finds first occurrence of tag p in body 394 804 >>> p 395 <Element p at 8416e0c>396 >>> links = p.getiterator("a")# Returns list of all links805 <Element 'p' at 0xb77ec26c> 806 >>> links = list(p.iter("a")) # Returns list of all links 397 807 >>> links 398 [<Element a at b7d4f9ec>, <Element a at b7d4fb0c>]808 [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>] 399 809 >>> for i in links: # Iterates through all found links 400 810 ... i.attrib["target"] = "blank" … … 404 814 405 815 QName Objects 406 ------------- 407 408 409 .. class:: QName(text_or_uri [, tag])410 411 QName wrapper. This can be used to wrap a QName attribute value, in order to412 get proper namespace handling on output. *text_or_uri* is a string containing413 the QName value, in the form {uri}local, or, if the tag argument is given, the414 URI part of a QName. 
If *tag* is given, the first argument is interpreted as an415 URI, and this argument is interpreted as a local name. :class:`QName` instances416 are opaque.816 ^^^^^^^^^^^^^ 817 818 819 .. class:: QName(text_or_uri, tag=None) 820 821 QName wrapper. This can be used to wrap a QName attribute value, in order 822 to get proper namespace handling on output. *text_or_uri* is a string 823 containing the QName value, in the form {uri}local, or, if the tag argument 824 is given, the URI part of a QName. If *tag* is given, the first argument is 825 interpreted as an URI, and this argument is interpreted as a local name. 826 :class:`QName` instances are opaque. 417 827 418 828 … … 420 830 421 831 TreeBuilder Objects 422 ------------------- 423 424 425 .. class:: TreeBuilder( [element_factory])426 427 Generic element structure builder. This builder converts a sequence of start,428 data, and end method calls to a well-formed element structure. You can use this429 c lass to build an element structure using a custom XML parser, or a parser for430 some other XML-like format. The *element_factory* is called to create new431 Elementinstances when given.832 ^^^^^^^^^^^^^^^^^^^ 833 834 835 .. class:: TreeBuilder(element_factory=None) 836 837 Generic element structure builder. This builder converts a sequence of 838 start, data, and end method calls to a well-formed element structure. You 839 can use this class to build an element structure using a custom XML parser, 840 or a parser for some other XML-like format. The *element_factory* is called 841 to create new :class:`Element` instances when given. 432 842 433 843 434 844 .. method:: close() 435 845 436 Flushes the parser buffers, and returns the toplevel document437 element. Returns an Elementinstance.846 Flushes the builder buffers, and returns the toplevel document 847 element. Returns an :class:`Element` instance. 438 848 439 849 440 850 .. method:: data(data) 441 851 442 Adds text to the current element. *data* is a string. 
This should be443 either a n 8-bit string containing ASCII text, or a Unicode string.852 Adds text to the current element. *data* is a string. This should be 853 either a bytestring, or a Unicode string. 444 854 445 855 446 856 .. method:: end(tag) 447 857 448 Closes the current element. *tag* is the element name. Returns the closed449 element.858 Closes the current element. *tag* is the element name. Returns the 859 closed element. 450 860 451 861 452 862 .. method:: start(tag, attrs) 453 863 454 Opens a new element. *tag* is the element name. *attrs* is a dictionary 455 containing element attributes. Returns the opened element. 456 457 458 .. _elementtree-xmltreebuilder-objects: 459 460 XMLTreeBuilder Objects 461 ---------------------- 462 463 464 .. class:: XMLTreeBuilder([html,] [target]) 465 466 Element structure builder for XML source data, based on the expat parser. *html* 467 are predefined HTML entities. This flag is not supported by the current 468 implementation. *target* is the target object. If omitted, the builder uses an 469 instance of the standard TreeBuilder class. 864 Opens a new element. *tag* is the element name. *attrs* is a dictionary 865 containing element attributes. Returns the opened element. 866 867 868 In addition, a custom :class:`TreeBuilder` object can provide the 869 following method: 870 871 .. method:: doctype(name, pubid, system) 872 873 Handles a doctype declaration. *name* is the doctype name. *pubid* is 874 the public identifier. *system* is the system identifier. This method 875 does not exist on the default :class:`TreeBuilder` class. 876 877 .. versionadded:: 2.7 878 879 880 .. _elementtree-xmlparser-objects: 881 882 XMLParser Objects 883 ^^^^^^^^^^^^^^^^^ 884 885 886 .. class:: XMLParser(html=0, target=None, encoding=None) 887 888 :class:`Element` structure builder for XML source data, based on the expat 889 parser. *html* are predefined HTML entities. This flag is not supported by 890 the current implementation. 
*target* is the target object. If omitted, the 891 builder uses an instance of the standard TreeBuilder class. *encoding* [1]_ 892 is optional. If given, the value overrides the encoding specified in the 893 XML file. 470 894 471 895 472 896 .. method:: close() 473 897 474 Finishes feeding data to the parser. Returns an element structure.898 Finishes feeding data to the parser. Returns an element structure. 475 899 476 900 477 901 .. method:: doctype(name, pubid, system) 478 902 479 Handles a doctype declaration. *name* is the doctype name. *pubid* is the 480 public identifier. *system* is the system identifier. 903 .. deprecated:: 2.7 904 Define the :meth:`TreeBuilder.doctype` method on a custom TreeBuilder 905 target. 481 906 482 907 483 908 .. method:: feed(data) 484 909 485 Feeds data to the parser. *data* is encoded data.486 487 :meth:`XML TreeBuilder.feed` calls *target*\'s :meth:`start` method910 Feeds data to the parser. *data* is encoded data. 911 912 :meth:`XMLParser.feed` calls *target*\'s :meth:`start` method 488 913 for each opening tag, its :meth:`end` method for each closing tag, 489 and data is processed by method :meth:`data`. :meth:`XMLTreeBuilder.close`914 and data is processed by method :meth:`data`. :meth:`XMLParser.close` 490 915 calls *target*\'s method :meth:`close`. 491 :class:`XML TreeBuilder` can be used not only for building a tree structure.916 :class:`XMLParser` can be used not only for building a tree structure. 492 917 This is an example of counting the maximum depth of an XML file:: 493 918 494 >>> from xml.etree.ElementTree import XML TreeBuilder919 >>> from xml.etree.ElementTree import XMLParser 495 920 >>> class MaxDepth: # The target object of the parser 496 921 ... maxDepth = 0 … … 508 933 ... 509 934 >>> target = MaxDepth() 510 >>> parser = XML TreeBuilder(target=target)935 >>> parser = XMLParser(target=target) 511 936 >>> exampleXml = """ 512 937 ... <a> … … 528 953 529 954 .. 
[#] The encoding string included in XML output should conform to the
   appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
   not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
   and http://www.iana.org/assignments/character-sets.
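As a small sketch of the point made in the footnote, the encoding name passed to :meth:`ElementTree.write` is what appears in the XML declaration of the output, so a standard name such as ``"UTF-8"`` should be used. The element and its text below are hypothetical and serve only as an illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical element, used only for this illustration.
root = ET.Element("greeting")
root.text = u"caf\xe9"

buf = io.BytesIO()
# Request an explicit XML declaration and name the encoding with its
# standard spelling.
ET.ElementTree(root).write(buf, encoding="UTF-8", xml_declaration=True)
data = buf.getvalue()

# The first line of the serialized output is the XML declaration.
declaration = data.split(b"\n", 1)[0]
```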