
XML Tree parsing with condition in Python


Here is my XML structure:

<images>
  <image>
    <name>brain tumer</name>
    <location>images/brain_tumer1.jpg</location>
    <annotations>
      <comment>
        <name>Patient 0 Brain Tumer</name>
        <description>
          This is a tumer in the brain
        </description>
      </comment>
    </annotations>
  </image>
  <image>
    <name>brain tumer</name>
    <location>img/brain_tumer2.jpg</location>
    <annotations>
      <comment>
        <name>Patient 1 Brain Tumer</name>
        <description>
          This is a larger tumer in the brain
        </description>
      </comment>
    </annotations>
  </image>
</images>

I am new to Python and want to know whether it is possible to retrieve the location data based on the comment's name value. Here is my code so far:

for itr1 in itemlist:
    commentItemList = itr1.getElementsByTagName('name')

    for itr2 in commentItemList:
        if(itr2.firstChild.nodeValue == "Patient 1 Liver Tumer"):
            commentName = itr2.firstChild.nodeValue
            Loacation = it1.secondChild.nodeValue

Any recommendations, or am I missing something here? Thank you in advance.


Solution

  • Parsing XML with minidom isn't fun at all, but here's the idea:

    • iterate over all image nodes
    • for each node, check comment/name text
    • if the text matches, get the location node's text

    An example that finds the location for the comment named Patient 1 Brain Tumer:

    import xml.dom.minidom
    
    data = """
    your xml goes here
    """
    
    dom = xml.dom.minidom.parseString(data)
    for image in dom.getElementsByTagName('image'):
        # Take the first <comment> so that the image's own <name> is skipped
        comment = image.getElementsByTagName('comment')[0]
        comment_name_text = comment.getElementsByTagName('name')[0].firstChild.nodeValue
        if comment_name_text == 'Patient 1 Brain Tumer':
            # The text lives in a child text node of <location>
            location = image.getElementsByTagName('location')[0]
            print(location.firstChild.nodeValue)
    

    prints:

    img/brain_tumer2.jpg
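
If minidom is not a hard requirement, the same three steps can be sketched with the standard library's xml.etree.ElementTree, which is usually less verbose. This is only a minimal sketch, not the asker's original approach: the helper name find_locations and the SAMPLE_XML constant are made up for illustration, and the sample data is copied from the question with its spelling kept as-is.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question (spelling kept as in the original).
SAMPLE_XML = """
<images>
  <image>
    <name>brain tumer</name>
    <location>images/brain_tumer1.jpg</location>
    <annotations>
      <comment>
        <name>Patient 0 Brain Tumer</name>
        <description>This is a tumer in the brain</description>
      </comment>
    </annotations>
  </image>
  <image>
    <name>brain tumer</name>
    <location>img/brain_tumer2.jpg</location>
    <annotations>
      <comment>
        <name>Patient 1 Brain Tumer</name>
        <description>This is a larger tumer in the brain</description>
      </comment>
    </annotations>
  </image>
</images>
"""


def find_locations(xml_text, comment_name):
    """Return the location text of every image whose comment name matches.

    Hypothetical helper, written for this example only.
    """
    root = ET.fromstring(xml_text)
    locations = []
    for image in root.iter('image'):
        # The path only matches annotations/comment/name, so the image's
        # own <name> child is never compared against comment_name.
        names = [
            node.text.strip()
            for node in image.findall('annotations/comment/name')
            if node.text
        ]
        if comment_name in names:
            location = image.findtext('location')
            if location is not None:
                locations.append(location.strip())
    return locations


print(find_locations(SAMPLE_XML, 'Patient 1 Brain Tumer'))
```

Returning a list rather than printing inside the loop means a name with no match, such as the "Patient 1 Liver Tumer" string from the question's code, simply gives an empty list instead of raising an error.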