Search for specific text in an element of XML with DOM (Python)


For a Python program, I am looking for a way to find specific text inside an XML element and to find out the index of the node that contains it.

This is the XML:

<shortcut>
<label>33060</label>
<label2>Common Shortcut</label2>
</shortcut>

<shortcut>
<label>Test</label>
</shortcut>

Of course, I know it is node number 2 here, but the XML file can be longer.

These are two ways I tried, but I can't get either to work properly:

xmldoc = minidom.parse("/DATA.xml")
Shortcut = xmldoc.getElementsByTagName("shortcut")
Label = xmldoc.getElementsByTagName("label")
print xmldoc.getElementsByTagName("label")[12].firstChild.nodeValue  # works
for element in Label:
  if  element.getAttributeNode("label") == 'Test':
  # if element.getAttributeNode('label') == "Test":
    print "element found"
else:
    print "element not found"

for node in xmldoc.getElementsByTagName("label"):
    if node.nodeValue == "Test":
        print "element found"
else:
    print "element not found"

Solution

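  • This working example demonstrates one possible way to find an element containing specific text using the minidom module*. In minidom, an element's text is stored in its child text nodes, so nodeValue on the element itself is None, and getAttributeNode only looks up XML attributes, not element content: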

    from xml.dom.minidom import parseString
    
    def getText(nodelist):
        rc = []
        for node in nodelist:
            if node.nodeType == node.TEXT_NODE:
                rc.append(node.data)
        return ''.join(rc)
    
    
    xml = """<root>
    <shortcut>
    <label>33060</label>
    <label2>Common Shortcut</label2>
    </shortcut>
    <shortcut>
    <label>Test</label>
    </shortcut>
    </root>"""
    xmldoc = parseString(xml)
    labels = xmldoc.getElementsByTagName("label")
    for label in labels:
        text = getText(label.childNodes)
        if text == "Test":
            print("node found : " + label.toprettyxml())
            break
    

    Output:

    node found : <label>Test</label>
    

    *) The getText() function is taken from the minidom documentation page.
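
The question also asks which node number the match is. A minimal sketch of one way to get it, assuming the position wanted is that of the enclosing shortcut element counted from 0: loop over the shortcut elements with enumerate() and compare the text of each label child. The names get_text and find_shortcut_index are hypothetical helpers introduced for this illustration, not part of minidom, and the root wrapper element is the same assumption the example above makes.

```python
from xml.dom.minidom import parseString

XML = """<root>
<shortcut>
<label>33060</label>
<label2>Common Shortcut</label2>
</shortcut>
<shortcut>
<label>Test</label>
</shortcut>
</root>"""


def get_text(nodelist):
    # Join the data of the text nodes found directly in nodelist.
    return "".join(n.data for n in nodelist if n.nodeType == n.TEXT_NODE)


def find_shortcut_index(xmldoc, wanted):
    # Hypothetical helper: return the 0-based position of the first
    # <shortcut> that has a <label> whose text equals `wanted`,
    # or None when no shortcut matches.
    for index, shortcut in enumerate(xmldoc.getElementsByTagName("shortcut")):
        for label in shortcut.getElementsByTagName("label"):
            if get_text(label.childNodes) == wanted:
                return index
    return None


xmldoc = parseString(XML)
index = find_shortcut_index(xmldoc, "Test")
if index is None:
    print("element not found")
else:
    print("element found at shortcut index %d" % index)
```

With the sample data this reports index 1, which is the second shortcut because counting starts at 0. Note that getElementsByTagName("label") matches the tag name exactly, so label2 elements are not searched.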