Python xmltodict: How to preserve XML element order?


I'm using xmltodict for XML parsing/unparsing, and I need to preserve the XML element ordering while processing one document. Toy REPL example:

>>> import xmltodict
>>> xml = """
... <root>
...   <a />
...   <b />
...   <a />
... </root>
... """
>>> xmltodict.parse(xml)
OrderedDict([('root', OrderedDict([('a', [None, None]), ('b', None)]))])
>>> xmltodict.unparse(_)
'<?xml version="1.0" encoding="utf-8"?>\n<root><a></a><a></a><b></b></root>'

Note that the original sequence [a, b, a] is replaced by [a, a, b]. Is there any way to preserve the original order with xmltodict?


Solution

  • It's not super elegant, but minidom can do the job just fine:

    import xml.dom.minidom as minidom
    
    xml = """
    <root>
    <a />
    <b />
    <a />
    </root>
    """
    doc = minidom.parseString(xml)                  # or minidom.parse(filename)
    root = doc.getElementsByTagName('root')[0]      # or doc.documentElement
    items = [n for n in root.childNodes if n.nodeType == doc.ELEMENT_NODE]
    
    for item in items:
        print(item.nodeName)                        # element names, in document order
    

    You can of course use a more full-featured library such as lxml, but for the modest task of iterating over some nodes in document order that is probably unnecessary.
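    If the document also has to be written back out, the same minidom approach can be wrapped in a small helper. The sketch below is only an illustration: the names `child_element_names` and `round_tripped` are made up for this example, and it assumes the elements of interest are direct children of the root. It collects the child element names in document order, serializes the tree again, and checks that the order survives the round trip.

```python
import xml.dom.minidom as minidom


def child_element_names(xml_text):
    """Return the tag names of the root's direct child elements, in document order.

    Hypothetical helper for illustration; not part of xmltodict or minidom.
    """
    doc = minidom.parseString(xml_text)
    root = doc.documentElement
    # childNodes also contains whitespace text nodes, so keep only elements.
    return [n.nodeName for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]


xml = """
<root>
  <a />
  <b />
  <a />
</root>
"""

names = child_element_names(xml)

# Serialize the parsed tree again; minidom keeps children in their original order.
round_tripped = minidom.parseString(xml).documentElement.toxml()
names_after = child_element_names(round_tripped)

print(names)
print(names_after)
```

    Both lists should come out as a, b, a, since minidom stores children as an ordered node list rather than grouping them by tag name.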