handyxml

Handyxml is a Python module that makes simple XML use convenient. It provides Python attributes for XML attributes and child elements, and provides a facade over two common XML parsers and an XPath implementation. It’s designed for casual XML use in environments such as template engines where data is stored in XML files.

Download: handyxml.py

Usage

Imagine a simple XML file:

<!-- MyDataFile.xml -->
<mydata group='ducks'>
    <property name='duck1' value='huey' />
    <property name='duck2' value='louie' />
    <property name='duck3' value='dewey' />
</mydata>

To parse the file, simply use the xml function:

>> import handyxml
>> doc = handyxml.xml('MyDataFile.xml')

It accepts either a filename or an open file as its argument, and parses the XML it finds. Its return value is the document element, and attributes on the element are available as attributes on the object:

>> doc.group
u'ducks'

The names of child elements are object attributes with lists of the child elements:

>> len(doc.property)
3
>> doc.property[1].name
u'duck2'

Unlike other Python XML-to-dictionary schemes, these objects act just like full-fledged DOM nodes, and so can be processed with other XML facilities.

Handyxml also provides the xpath function for evaluating XPath expressions. The first argument is either an argument acceptable to xml(), or a parsed XML node. The second argument is the XPath expression to evaluate against the XML. The return is the resulting list of XML nodes:

>> duck3 = handyxml.xpath(doc, '//property[@name="duck3"]')
>> duck3[0].value
u'dewey'

When looking for a file, handyxml will search the list of directories specified in its path value, much like sys.path.

Changes

27 January 2004: The xml.xpath module is no longer required, and the xpath() function will raise an exception if the module couldn’t be imported. The xml() function can now take either a string (filename) or an open file as its argument.

Comments

[gravatar]
>> handyxml.xpath(doc, '//property[@name="duck3"]')

Would not it be nicer to have

doc.xpath('...')
[gravatar]
Your handyxml.py has:

from xml import xpath as ...

but that raises an ImportError in my Python 2.3.3 version. Is xml.xpath from a 3rd party?
[gravatar]
Zoran: it might be nicer to have doc.xpath(), but I wanted to stay away from adding attributes. Since any object attribute would collide with an XML attribute, I thought I should stay out of the way of the XML data.

Any other opinions?
[gravatar]
I like the Pythonic access to nodes/attributes in XML docs. However, I don't see how I can navigate through the document without apriori knowledge of the names of elements. Is there a way using xmlhandy (or underlying modules) to "discover" at run-time the structure and element names?
[gravatar]
For more advanced use (like discovering the structure), you'll need to use the DOM interfaces that underly handyxml. They're all still available: childNodes, nodeName, etc.
[gravatar]
with python 2.3.3 when I try
import handyxml
I get ImportError: No module named ext.reader

I just have the standard python instilation nothing special. Am I missing a module?
[gravatar]
nevermine I had not installed pyXML which I guess you need to do first
[gravatar]
Are there any handy xml parsers that also provide read/write access to the element tree in-memory? I don't want to clone nodes unnecessarily.
[gravatar]
Grigory Bakunov 6:24 AM on 19 Jun 2004
Looks good enough.
But how can I get CDATA?
[gravatar]
Sorry, but I don't provide any special access to the character data of an element (because I didn't need it). You can use the DOM methods on the element, though.
[gravatar]
I just started learning Python, and I stumbled across this while looking for something to make simple XML processing easier. Very nice.

I am working on a console program that will load an xml file, and then allow repeated user input to work with that XML. I want the user to be able to make changes to the XML file on disk, and then reload it without having to exit the program. The built-in cache in handyxml prevented this.

I just added a second argument to the xml function: xml(xmlin, cache=True), and then whenever the _xmlcache is accessed it checks the cache flag first. This way I can specify that a particular file not be cached.

Not sure if this would be useful to others, but thought I would pass it along in case you were still updating this script.
[gravatar]
Just a note to say that handyxml is broken with python-2.4.3. I believe that the solution is to replace the use of xml.dom.ext.reader.PyExpat.Reader().fromString with xml.dom.minidom.parseString
[gravatar]
I am completely new to Python, so maybe others did figure it out on their own and I can't.

Since PyXML is not available anymore, how to use handyxml and Cog using currently available tools?

Which modules to install and which modifications to make?

Add a comment:

Ignore this:
Leave this empty:
Name is required. Either email or web are required. Email won't be displayed and I won't spam you. Your web site won't be indexed by search engines.
Don't put anything here:
Leave this empty:
Comment text is Markdown.