Tags: python, xml, elementtree, pretty-print, xml.etree

Formatting inserted elements with Python's xml.etree module so that they include newlines


I am inserting a single element into a large XML file. I want the inserted element to be near the top, so I need to use the root.insert method and can't just append it to the root. I would also like the formatting of the element to match the rest of the file.

The original XML file has the format

<a>
    <b>
        <c/>
    </b>
    <d>
        <e/>
    </d>
    ....
</a>

I then run the following code:

import xml.etree.ElementTree as ET    

xmlfile = ET.parse('file.xml')
a = xmlfile.getroot()

f = ET.Element('f')
g = ET.SubElement(f,'g')

a.insert(1, f)

xmlfile.write('file.xml')

Which creates an output in the form:

<a>
    <b>
        <c/>
    </b>
    <f><g/></f><d>
        <e/>
    </d>
    ....
</a>

but I would like it in the form:

<a>
    <b>
        <c/>
    </b>
    <f>
        <g/>
    </f>
    <d>
        <e/>
    </d>
    ....
</a>

Using Jonathan Eunice's solution to the question 'How do I get Python's ElementTree to pretty print to an XML file?' I have added the following code to replace the xmlfile.write command:

from xml.dom import minidom
xmlstr = minidom.parseString(ET.tostring(a)).toprettyxml(indent="   ")
with open("New_Database.xml", "w") as out:  # 'out' avoids shadowing the element f
    out.write(xmlstr)

However, the formatting of the whole file is still not correct. The new element is formatted correctly, but the original elements are now spaced out:

<a>
<b>


    <c/>


</b>


<f>
    <g/>
</f>
<d>


    <e/>


</d>
....
</a>

I think this is because toprettyxml() adds its own newline and indentation around every node while keeping the '\n' whitespace that is already in the file, so the original elements end up with extra blank lines. Fiddling with the inputs just changes whether the added element or the original elements are formatted incorrectly. I need a way to make the formatting of the new element and the original elements consistent before writing the file, either by formatting the new element to match before I insert it, or by normalising the original elements so that the whole tree can be reformatted in one go. Is it possible to add formatting using 'xml.etree.ElementTree'?
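
The minidom route can be made consistent by removing the existing indentation before pretty-printing. The sketch below is one possible way to do that, not a definitive fix: the helper name strip_whitespace and the inline SOURCE string are illustrative assumptions, and the approach assumes the document has no mixed content where whitespace-only text is meaningful.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Stand-in for file.xml, using the structure from the question.
SOURCE = """<a>
    <b>
        <c/>
    </b>
    <d>
        <e/>
    </d>
</a>"""


def strip_whitespace(elem):
    """Remove whitespace-only text and tail values from elem and its descendants."""
    for node in elem.iter():
        if node.text is not None and not node.text.strip():
            node.text = None
        if node.tail is not None and not node.tail.strip():
            node.tail = None


root = ET.fromstring(SOURCE)

f = ET.Element("f")
ET.SubElement(f, "g")
root.insert(1, f)

# With the old indentation gone, toprettyxml() is the only source of
# newlines and indentation, so old and new elements are treated alike.
strip_whitespace(root)
pretty = minidom.parseString(ET.tostring(root)).toprettyxml(indent="    ")
print(pretty)
```

With the leftover whitespace removed, every element should appear on its own line with no blank lines in between. Note that toprettyxml() also prepends an XML declaration.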

Thanks in advance.


Solution

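  • You can control the whitespace around the new elements directly through their text and tail properties: text holds what follows an element's opening tag, and tail holds what follows its closing tag. Perhaps this is good enough for you. See the demo below.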

    Input document:

    <a>
        <b>
            <c/>
        </b>
        <d>
            <e/>
        </d>
    </a>
    

    Code:

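    import xml.etree.ElementTree as ET

    xmlfile = ET.parse('file.xml')
    a = xmlfile.getroot()

    f = ET.Element('f')
    g = ET.SubElement(f, 'g')

    # Whitespace after </f>: newline plus the indent of the next sibling <d>
    f.tail = "\n    "
    # Whitespace after <f>: newline plus the indent of the child <g>
    f.text = "\n        "
    # Whitespace after <g/>: newline plus the indent of the closing </f>
    g.tail = "\n    "

    a.insert(1, f)

    # Python 3: encoding="unicode" makes tostring() return str instead of bytes
    print(ET.tostring(a, encoding="unicode"))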
    

    Output:

    <a>
        <b>
            <c />
        </b>
        <f>
            <g />
        </f>
        <d>
            <e />
        </d>
    </a>
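
If setting text and tail by hand becomes tedious, Python 3.9 and later provide ET.indent(), which rewrites the whitespace of a whole tree in place. The following is a minimal sketch under that version assumption. It uses an inline SOURCE string in place of file.xml, and it re-indents every element, so it only matches the original file if the chosen space value equals the file's existing indent width.

```python
import xml.etree.ElementTree as ET

# Stand-in for file.xml, using the input document from the demo.
SOURCE = """<a>
    <b>
        <c/>
    </b>
    <d>
        <e/>
    </d>
</a>"""

root = ET.fromstring(SOURCE)

f = ET.Element("f")
ET.SubElement(f, "g")
root.insert(1, f)

# Requires Python 3.9+. Overwrites whitespace-only text/tail values
# throughout the tree with a uniform four-space indentation.
ET.indent(root, space="    ")

result = ET.tostring(root, encoding="unicode")
print(result)
```

This should print the same layout as the output shown above, with the new element indented like its siblings.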