
How to preserve the XML declaration of the input file using xml.etree.ElementTree


I am reading an XML file, adding some tags, and writing it back.

The file I read starts with <?xml version="1.0" encoding="UTF-8" standalone="yes"?>, but my output only has <?xml version="1.0" ?>.

I use the following code:

    import os
    from xml.dom import minidom
    import xml.etree.ElementTree as ET

    tree = ET.parse(xml_file)
    root = tree.getroot()
    access = ""

    # ... (rest of the processing logic)

    # Write to a temporary string to control indentation
    rough_string = ET.tostring(root, 'utf-8')
    reparsed = minidom.parseString(rough_string)

    # Write the formatted XML back to the original file, skipping empty lines
    with open(xml_file, 'w', encoding='utf-8') as f:
        for line in reparsed.toprettyxml(indent="  ").splitlines():
            if line.strip():
                f.write(line + '\n')

How can I preserve the XML declaration from my original document?

Edit:

I solved it by writing the declaration line manually:

    with open(xml_file, 'w', encoding='utf-8') as f:
        custom_line = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f.write(custom_line + '\n')
        for line in reparsed.toprettyxml(indent="  ").splitlines():
            if line.strip() and not line.startswith('<?xml'):
                f.write(line + '\n')
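
Hard-coding the declaration works as long as every input file uses the same one. If the declaration can differ between files, one option is to read it from the original text before parsing and reuse it when writing. The following is only a minimal sketch: the helper name read_xml_declaration and the fallback DEFAULT_DECLARATION are hypothetical and not part of ElementTree or minidom.

```python
import re

# Hypothetical fallback, used only when the input has no declaration at all.
DEFAULT_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>'


def read_xml_declaration(text, default=DEFAULT_DECLARATION):
    """Return the XML declaration at the start of text, or default if absent."""
    # A byte order mark may precede the declaration, so strip it first.
    # "\s" after "<?xml" avoids matching instructions such as <?xml-stylesheet ...?>.
    match = re.match(r'<\?xml\s.*?\?>', text.lstrip('\ufeff'), re.DOTALL)
    return match.group(0) if match else default


original = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<root/>'
print(read_xml_declaration(original))
```

In the code above, the returned string could then be written in place of custom_line, assuming the file's text was read before it was overwritten.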

Solution

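  • ElementTree does not keep the declaration when parsing, and toprettyxml() writes its own <?xml version="1.0" ?> line. I solved it by writing the declaration myself and skipping the generated one: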

    with open(xml_file, 'w', encoding='utf-8') as f:
        custom_line = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f.write(custom_line + '\n')
        for line in reparsed.toprettyxml(indent="  ").splitlines():
            if line.strip() and not line.startswith('<?xml'):
                f.write(line + '\n')
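
As a possible alternative to filtering lines by hand, minidom can emit the full declaration itself: toprettyxml() accepts encoding and, from Python 3.9 on, standalone. The sketch below assumes Python 3.9 or newer; the function name to_pretty_xml and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def to_pretty_xml(root, standalone=True):
    """Pretty-print an ElementTree element with a full XML declaration."""
    rough = ET.tostring(root, encoding="utf-8")
    reparsed = minidom.parseString(rough)
    # With encoding set, toprettyxml() returns bytes and writes the encoding
    # into the declaration; the standalone argument requires Python 3.9+.
    pretty = reparsed.toprettyxml(indent="  ", encoding="UTF-8", standalone=standalone)
    # Drop whitespace-only lines, as the original loop does.
    lines = pretty.decode("utf-8").splitlines()
    return "\n".join(line for line in lines if line.strip()) + "\n"


# Made-up sample document, only to show the call.
root = ET.fromstring("<root><item>1</item></root>")
result = to_pretty_xml(root)
print(result)
```

The returned string can be written with open(xml_file, 'w', encoding='utf-8') as in the code above. This always writes standalone="yes" (or "no"), so it suits inputs where that value is known in advance.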