How to remove specific attributes from an ElementTree


I'm trying to remove all lines from an XML file that have one of these two forms:

<attr key="filename"><string>[SOME_FILENAME]</string></attr>
<attr key="line_number"><integer>[SOME_NUMBER]</integer></attr>

Right now my code looks like this:

for parent in tree.iter():
    for child in parent:
        if 'key' in child.attrib:
            if child.attrib['key'] == 'phc.filename':
                del child.attrib['key']
            elif child.attrib['key'] == 'phc.line_number':
                del child.attrib['key']

But the output isn't what I want. It changes this:

<attr key="filename"><string>[SOME_FILENAME]</string></attr>
<attr key="line_number"><integer>[SOME_NUMBER]</integer></attr>

into this

<attr><string>[SOME_FILENAME]</string></attr>
<attr><integer>[SOME_NUMBER]</integer></attr>

I'd rather have both of those lines removed altogether.

I've also tried replacing the "del child.attrib['key']" lines with parent.remove(child) but that doesn't work the way I tried it either.


Solution

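  • That is because del child.attrib['key'] removes only the attribute; the element itself stays in the tree. To drop the element, call parent.remove(child). Don't call it while you are still looping over that parent, though: removing children during iteration can cause siblings to be skipped. Collect the matches first and remove them afterwards.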

    Try:

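        # Keys that mark the <attr> elements to delete
        unwanted = {'phc.filename', 'phc.line_number'}

        # First pass: collect (parent, child) pairs without modifying the tree
        to_remove = []
        for parent in tree.iter():
            for child in parent:
                if child.attrib.get('key') in unwanted:
                    to_remove.append((parent, child))

        # Second pass: removing is safe now that iteration has finished
        for parent, child in to_remove:
            parent.remove(child)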
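
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It is a variation on the approach above, not the only way to do it: instead of collecting pairs, it iterates over snapshots (list(...)) so the tree can be modified inside the loop. The sample XML, the `phc.comments` key, and the names `SAMPLE`, `UNWANTED_KEYS` and `strip_attrs` are made up for illustration. The sample uses the `phc.`-prefixed key values that the code in the question compares against, so adjust them if your file uses different ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; the real file's structure may differ.
SAMPLE = """<root>
  <node>
    <attr key="phc.filename"><string>example.php</string></attr>
    <attr key="phc.line_number"><integer>12</integer></attr>
    <attr key="phc.comments"><string>keep me</string></attr>
  </node>
</root>"""

# Key values taken from the comparisons in the question's code.
UNWANTED_KEYS = {"phc.filename", "phc.line_number"}


def strip_attrs(root, unwanted=UNWANTED_KEYS):
    """Remove every child element whose 'key' attribute is in `unwanted`."""
    # list(root.iter()) snapshots all elements before anything is removed.
    for parent in list(root.iter()):
        # list(parent) snapshots the children, so removing from `parent`
        # inside the loop does not cause siblings to be skipped.
        for child in list(parent):
            if child.get("key") in unwanted:
                parent.remove(child)
    return root


root = strip_attrs(ET.fromstring(SAMPLE))

# Keys of the <attr> elements that are still in the tree.
remaining_keys = [el.get("key") for el in root.iter("attr")]

print(ET.tostring(root, encoding="unicode"))
```

Note that remove() drops the whole element, including its children and the whitespace that followed it, so the indentation of the serialized output may look slightly different from the input.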