How to generate an XML file via Python's ElementTree with registered namespaces written in output


I need to generate an XML file based on a given schema. The schema requires a namespace prefix on the elements in the generated XML file.

I need to use cElementTree for backwards compatibility reasons. At the same time, I want to pretty-print the XML output, i.e. with indentation. I know this can be done via xml.dom.

Here's what I have tried:

import sys
import cElementTree as ET
from xml.dom import minidom
ET.register_namespace('xs', 'http://www.w3.org/2001/XMLSchema')
root = ET.Element('House')
ET.SubElement(root, 'Room')
etreeString = ET.tostring(root, 'utf-8')

Output for the above code is:

<House><Room /></House>

How can I get the elements properly prefixed with the standard namespace? Also, how can I get an XML file that contains the XML declaration at the top?

I have tried creating an instance of the xml.etree.ElementTree.ElementTree class so that I can use its write method, like this:

tree = ET.ElementTree(root)
tree.write(sys.stdout)

but once again, I get no namespaces:

<House><Room /></House>

Finally, if I add the prefix when creating each element (which feels wrong), xml.dom cannot parse the result, and I do not know how to make it parse namespace prefixes:

>>> kitchenElem = ET.SubElement(root, 'xs:Kitchen')
>>> tree = ET.ElementTree(root)
>>> tree.write(sys.stdout)
<House><Room /><xs:Kitchen /><xs:Kitchen /></House>
>>> etreeString = ET.tostring(root, 'utf-8')
>>> etreeString
'<House><Room /><xs:Kitchen /><xs:Kitchen /></House>'
>>> minidomParsed = minidom.parseString(etreeString)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "ext\vc12_win32\lib\python2.7\site-packages\_xmlplus\dom\minidom.py", line 1925, in parseString
  File "ext\vc12_win32\lib\python2.7\site-packages\_xmlplus\dom\expatbuilder.py", line 942, in parseString
  File "ext\vc12_win32\lib\python2.7\site-packages\_xmlplus\dom\expatbuilder.py", line 223, in parseString
ExpatError: unbound prefix: line 1, column 15

Solution

  • To get a properly prefixed name, use QName() so that each element is actually created in the namespace.

    To write the XML with the XML declaration, pass xml_declaration=True to ElementTree.write().

    Example:

    Python

    import xml.etree.cElementTree as ET
    
    ns = {"xs": "http://www.w3.org/2001/XMLSchema"}
    
    ET.register_namespace('xs', ns["xs"])
    root = ET.Element(ET.QName(ns["xs"], "House"))
    ET.SubElement(root, ET.QName(ns["xs"], "Room"))
    
    ET.ElementTree(root).write("output.xml", xml_declaration=True, encoding="utf-8")
    

    XML Output

    <?xml version='1.0' encoding='utf-8'?>
    <xs:House xmlns:xs="http://www.w3.org/2001/XMLSchema"><xs:Room /></xs:House>
    

    Note: You don't have to use the ns dictionary. I only used it to avoid repeating the full namespace URI everywhere.
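
The question also asks about pretty-printing, which the example above does not cover. The following is a minimal sketch of one way to do it, assuming the tree is built with QName() as in the answer so that minidom sees a bound xs prefix. The helper name build_pretty_xml and the two-space indent are illustrative choices, not part of the original answer, and the fallback import is only there because xml.etree.cElementTree no longer exists in Python 3.9 and later.

```python
try:
    import xml.etree.cElementTree as ET  # Python 2.x and Python 3 before 3.9
except ImportError:
    import xml.etree.ElementTree as ET   # cElementTree was removed in Python 3.9
from xml.dom import minidom

XS = "http://www.w3.org/2001/XMLSchema"


def build_pretty_xml():
    """Hypothetical helper: build the House/Room tree and return indented XML bytes."""
    # Map the URI to the "xs" prefix for serialization.
    ET.register_namespace("xs", XS)

    # QName puts each element in the namespace, so the serializer emits
    # the xmlns:xs declaration and the xs: prefixes.
    root = ET.Element(ET.QName(XS, "House"))
    ET.SubElement(root, ET.QName(XS, "Room"))

    # Serialize, then let minidom re-parse and indent the document.
    raw = ET.tostring(root, encoding="utf-8")
    return minidom.parseString(raw).toprettyxml(indent="  ", encoding="utf-8")


pretty = build_pretty_xml()

# Write the bytes as-is; they already start with an XML declaration.
with open("output.xml", "wb") as f:
    f.write(pretty)
```

Because the prefix is now declared on the root element, minidom.parseString() should no longer raise the "unbound prefix" error shown in the question. On Python 3.9 or later, ET.indent() is another option that avoids the round trip through minidom.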