Pretty printing XML in Python

import xml.dom.minidom

dom = xml.dom.minidom.parse(xml_fname) # or xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = dom.toprettyxml()

Another solution is to borrow this indent function, for use with the ElementTree library that's built in to Python since 2.5. Here's what that would look like:

from xml.etree import ElementTree

def indent(elem, level=0):
    i = "\n" + level*"  "
    j = "\n" + (level-1)*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for subelem in elem:
            indent(subelem, level+1)
        if not elem.tail or not elem.tail.strip():
            elem.tail = j
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = j
    return elem        

root = ElementTree.parse('/tmp/xmlfile').getroot()
indent(root)
ElementTree.dump(root)

You have a few options.

xml.etree.ElementTree.indent()

Batteries included, simple to use, pretty output.

But requires Python 3.9+

import xml.etree.ElementTree as ET

element = ET.XML("<html><body>text</body></html>")
ET.indent(element)
print(ET.tostring(element, encoding='unicode'))

BeautifulSoup.prettify()

BeautifulSoup may be the simplest solution for Python < 3.9.

from bs4 import BeautifulSoup

bs = BeautifulSoup(open(xml_file), 'xml')
pretty_xml = bs.prettify()
print(pretty_xml)

Output:

<?xml version="1.0" encoding="utf-8"?>
<issues>
 <issue>
  <id>
   1
  </id>
  <title>
   Add Visual Studio 2005 and 2008 solution files
  </title>
 </issue>
</issues>

This is my goto answer. The default arguments work as is. But text contents are spread out on separate lines as if they were nested elements.

lxml.etree.parse()

Prettier output but with arguments.

from lxml import etree

x = etree.parse(FILE_NAME)
pretty_xml = etree.tostring(x, pretty_print=True, encoding=str)

Produces:

  <issues>
    <issue>
      <id>1</id>
      <title>Add Visual Studio 2005 and 2008 solution files</title>
      <details>We need Visual Studio 2005/2008 project files for Windows.</details>
    </issue>
  </issues>

This works for me with no issues.

xml.dom.minidom.parse()

No external dependencies but post-processing.

import xml.dom.minidom as md

dom = md.parse(FILE_NAME)     
# To parse string instead use: dom = md.parseString(xml_string)
pretty_xml = dom.toprettyxml()
# remove the weird newline issue:
pretty_xml = os.linesep.join([s for s in pretty_xml.splitlines()
                              if s.strip()])

The output is the same as above, but it's more code.

lxml is recent, updated, and includes a pretty print function

import lxml.etree as etree

x = etree.parse("filename")
print etree.tostring(x, pretty_print=True)

Check out the lxml tutorial: http://lxml.de/tutorial.html

Pretty printing XML in Python

xml.etree.ElementTree.indent()

BeautifulSoup.prettify()

lxml.etree.parse()

xml.dom.minidom.parse()

Tags:

Python

Xml

Pretty Print

Related

Recent Posts