Find and remove sub-element in XML file


I’m new to Python so here is my problem:

XML:

<Configuration>
   <ConfiguredPaths>
       <ConfiguredPath>
           <LocalPath>C:\Temp</LocalPath>
           <EffectivePath>\\SERVERNAME\C$\Temp</EffectivePath>
       </ConfiguredPath>
       <ConfiguredPath>
           <LocalPath>C:\Files</LocalPath>
           <EffectivePath>\\SERVERNAME\C$\Files</EffectivePath>
       </ConfiguredPath>
       <ConfiguredPath>
           <LocalPath>C:\DOCS</LocalPath>
           <EffectivePath>\\SERVERNAME\C$\DOCS</EffectivePath>
       </ConfiguredPath>
   </ConfiguredPaths>
</Configuration>

I need to locate the "EffectivePath" element and, if its text equals a certain value, delete the whole section it belongs to, i.e. its parent "ConfiguredPath" element. Only the "ConfiguredPath" that contains that particular effective path should be deleted.

Here is an example result if EffectivePath = "\\SERVERNAME\C$\DOCS"

=> Result XML file should be as follows:

<Configuration>
   <ConfiguredPaths>
       <ConfiguredPath>
           <LocalPath>C:\Temp</LocalPath>
           <EffectivePath>\\SERVERNAME\C$\Temp</EffectivePath>
       </ConfiguredPath>
       <ConfiguredPath>
           <LocalPath>C:\Files</LocalPath>
           <EffectivePath>\\SERVERNAME\C$\Files</EffectivePath>
       </ConfiguredPath>
   </ConfiguredPaths>
</Configuration>

Here is my script; however, it removes all of ConfiguredPaths (and hence its children) rather than just the required one:

import xml.etree.ElementTree as ET
tree = ET.parse('Data.xml')
root = tree.getroot()

for child in root:
    if child.tag == "ConfiguredPaths":
        for elem in child.iter():
            if elem.tag == "ConfiguredPath":
                for child_elem in child.iter():
                    if child_elem.tag == "EffectivePath" and child_elem.text == r"\\SERVERNAME\C$\DOCS":
                        print(f"Required item is:", child_elem.tag, child_elem.text)
                        root.remove(child)

tree.write('output.xml') 

Solution

  • One more example, using lxml instead of ElementTree (ElementTree has only limited XPath support, and lxml also has the convenient .getparent() method).

    from lxml import etree
    
    to_remove = r"\\SERVERNAME\C$\DOCS"
    
    tree = etree.parse("Data.xml")
    
    # The context for tree is already /Configuration, so using a relative xpath.
    for elem in tree.xpath(f"./ConfiguredPaths/ConfiguredPath[EffectivePath='{to_remove}']"):
        elem.getparent().remove(elem)
    
    tree.write("output.xml")
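
If installing lxml is not an option, the same result should be achievable with the standard library alone. The script in the question fails for two reasons: its inner loop walks child.iter() (all of ConfiguredPaths) instead of elem.iter(), and it calls root.remove(child), which drops the ConfiguredPaths element itself. Because ElementTree elements have no .getparent(), the sketch below loops over the parent and removes matching children from it. The helper name remove_configured_path and the inline SAMPLE string are illustrative assumptions, not part of the original question or answer.

```python
import xml.etree.ElementTree as ET

# Shortened inline copy of the question's XML, used only for this demonstration.
SAMPLE = r"""<Configuration>
   <ConfiguredPaths>
       <ConfiguredPath>
           <LocalPath>C:\Temp</LocalPath>
           <EffectivePath>\\SERVERNAME\C$\Temp</EffectivePath>
       </ConfiguredPath>
       <ConfiguredPath>
           <LocalPath>C:\DOCS</LocalPath>
           <EffectivePath>\\SERVERNAME\C$\DOCS</EffectivePath>
       </ConfiguredPath>
   </ConfiguredPaths>
</Configuration>"""


def remove_configured_path(root, effective_path):
    """Remove each ConfiguredPath whose EffectivePath text equals effective_path.

    Hypothetical helper; returns the number of elements removed.
    """
    removed = 0
    # Walk the parents, because an Element does not know its own parent.
    for parent in list(root.iter("ConfiguredPaths")):
        # list(...) takes a snapshot so removing while looping is safe.
        for path in list(parent.findall("ConfiguredPath")):
            if path.findtext("EffectivePath") == effective_path:
                parent.remove(path)
                removed += 1
    return removed


root = ET.fromstring(SAMPLE)
removed = remove_configured_path(root, r"\\SERVERNAME\C$\DOCS")
remaining = [p.findtext("EffectivePath") for p in root.iter("ConfiguredPath")]
print(removed, remaining)
```

To apply this to files as in the question, parse with ET.parse('Data.xml'), pass tree.getroot() to the helper, and then call tree.write('output.xml').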