XML Processing in Python
Personally, I've played with several of the built-in options on an XML-heavy project and have settled on pulldom as the best choice for less complex documents.
Especially for small simple stuff, I like the event-driven theory of parsing rather than setting up a whole slew of callbacks for a relatively simple structure. Here is a good quick discussion of how to use the API.
What I like: you can handle the parsing in a for
loop rather than using callbacks. You also delay full parsing (the "pull" part) and only get additional detail when you call expandNode()
. This satisfies my general requirement for "responsible" efficiency without sacrificing ease of use and simplicity.
ElementTree has a nice pythony API. I think it's even shipped as part of python 2.5
It's in pure python and as I say, pretty nice, but if you wind up needing more performance, then lxml exposes the same API and uses libxml2 under the hood. You can theoretically just swap it in when you discover you need it.