How to create <!DOCTYPE> with Python's cElementTree
You could set the xml_declaration argument of the write function to False, so the output won't have an XML declaration with an encoding, and then write whatever header you need manually. Actually, if you pass the encoding as 'utf-8' (lowercase), the XML declaration won't be added either.
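try:
    import xml.etree.cElementTree as ElementTree
except ImportError:  # cElementTree was removed in Python 3.9
    import xml.etree.ElementTree as ElementTree
tree = ElementTree.Element('tmx', {'version': '1.4a'})
ElementTree.SubElement(tree, 'header', {'adminlang': 'EN'})
ElementTree.SubElement(tree, 'body')
with open('myfile.tmx', 'wb') as f:
    # Write the declaration and DOCTYPE by hand, then the tree without a declaration.
    f.write('<?xml version="1.0" encoding="UTF-8" ?><!DOCTYPE tmx SYSTEM "tmx14a.dtd">'.encode('utf8'))
    ElementTree.ElementTree(tree).write(f, 'utf-8')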
Resulting file (newlines added manually for readability):
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE tmx SYSTEM "tmx14a.dtd">
<tmx version="1.4a">
<header adminlang="EN" />
<body />
</tmx>
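If you'd rather not rely on the lowercase 'utf-8' behaviour, here is a minimal sketch of the same idea with xml_declaration=False passed explicitly. It assumes Python 3 and the plain xml.etree.ElementTree module, and it writes into an in-memory buffer instead of a file; the helper name serialize_with_doctype is made up for this example and is not part of ElementTree.

```python
import io
import xml.etree.ElementTree as ElementTree


def serialize_with_doctype(root, doctype):
    """Return the tree as UTF-8 bytes, preceded by an XML declaration and DOCTYPE.

    Hypothetical helper for illustration; ElementTree has no such function.
    """
    buf = io.BytesIO()
    # Write the header by hand.
    buf.write(b'<?xml version="1.0" encoding="UTF-8"?>\n')
    buf.write(doctype.encode('utf-8') + b'\n')
    # xml_declaration=False keeps write() from emitting a second declaration.
    ElementTree.ElementTree(root).write(buf, encoding='utf-8',
                                        xml_declaration=False)
    return buf.getvalue()


root = ElementTree.Element('tmx', {'version': '1.4a'})
ElementTree.SubElement(root, 'header', {'adminlang': 'EN'})
ElementTree.SubElement(root, 'body')
data = serialize_with_doctype(root, '<!DOCTYPE tmx SYSTEM "tmx14a.dtd">')
```

To get the same result on disk, write data to a file opened in 'wb' mode.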
You could use lxml and its tostring function:
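from lxml import etree
# Bytes input: lxml rejects unicode strings that carry an encoding declaration.
s = b"""<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4a"/>"""
tree = etree.fromstring(s)
header = etree.SubElement(tree, 'header', {'adminlang': 'EN'})
body = etree.SubElement(tree, 'body')
# tostring returns bytes when an encoding is given, so decode before printing.
print(etree.tostring(tree, encoding="UTF-8",
                     xml_declaration=True,
                     pretty_print=True,
                     doctype='<!DOCTYPE tmx SYSTEM "tmx14a.dtd">').decode("UTF-8"))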
=>
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE tmx SYSTEM "tmx14a.dtd">
<tmx version="1.4a">
  <header adminlang="EN"/>
  <body/>
</tmx>