
Minidom - check if an attribute is present in XML


I have a script that walks through all the XML files in a directory and parses each one to read the icp attribute of the is element. However, there are several thousand of these XML files, and some of them may not have the icp attribute on the is element. Is there a way to check for it with minidom?

Example of XML I am parsing where the is element has the icp attribute:

<is ico="0000000000" pcz="1" icp="12345678" icz="12345678" oddel="99">

Example of XML I am parsing where the is element has no icp attribute:

<is ico="000000000">

Here my script obviously fails because there is no icp attribute. How can I check whether icp is present?

My script:

import os
from xml.dom import minidom

# for testing purposes
directory = os.getcwd()

print("Source directory is: " + directory)
print("Walking the current directory, looking for XML files...")
print("Going through the XML files, looking for the provider's IČP...")

with open('ICP_all.txt', 'w') as SeznamICP_all:
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith('.xml'):
                xmldoc = minidom.parse(os.path.join(root, file))
                itemlist = xmldoc.getElementsByTagName('is')
                SeznamICP_all.write(itemlist[0].attributes['icp'].value + '\n')

print("Creating the list of unique IČP values...")

with open('ICP_distinct.txt', 'w') as distinct:
    with open('ICP_all.txt', 'r') as SeznamICP_all:
        distinct.writelines(set(SeznamICP_all))

input('Press Enter to exit...')

I googled a lot, yet I cannot find a simple answer on how to check whether an attribute is present in XML using minidom.

Could you please give me some advice?


Solution

  • You can use the hasAttribute(attributeName) method:

    ....
    itemlist = xmldoc.getElementsByTagName('is')
    if itemlist[0].hasAttribute("icp"):
        SeznamICP_all.write(itemlist[0].attributes['icp'].value + '\n')
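
For a self-contained illustration of the hasAttribute check, here is a minimal sketch that parses XML from strings instead of files. The helper name get_icp and the two sample strings are assumptions made for this example (the samples are the snippets from the question, closed so that they parse). It also guards against a document that has no is element at all, which the original script does not handle.

```python
from xml.dom import minidom


def get_icp(xml_text):
    """Return the icp attribute of the first <is> element, or None if absent.

    get_icp is a hypothetical helper name used only for this sketch.
    """
    doc = minidom.parseString(xml_text)
    elements = doc.getElementsByTagName('is')
    # Guard against documents that contain no <is> element at all.
    if not elements:
        return None
    first = elements[0]
    # hasAttribute is True only when the attribute is present on the element.
    if first.hasAttribute('icp'):
        return first.getAttribute('icp')
    return None


# Sample inputs based on the question, self-closed so they are well-formed.
with_icp = '<is ico="0000000000" pcz="1" icp="12345678" icz="12345678" oddel="99"/>'
without_icp = '<is ico="000000000"/>'

print(get_icp(with_icp))     # 12345678
print(get_icp(without_icp))  # None
```

In the directory-walking script, the same check would decide whether a line is written to ICP_all.txt, so files without icp are skipped instead of raising a KeyError.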