Skipping files if the XML-like part is missing

I am analyzing the XML data in several files. To get at the data, I first need to split the XML part out of the rest of the file's content so that I can work with it.

For this I use the split() method and search for <Data.

Here I run into a problem.

Some of the files contain no XML data, and I would like to simply skip those files.

import glob

path = r"C:\Users\Nathan\Desktop\Test\*.xml"

for xml in glob.glob(path):
    with open(xml) as data_file:
        file_content = data_file.read()

        xml_part1 = file_content.split("<Data", 1)[1]  # here I get an error if "<Data" is not in the file
        xml_part2 = xml_part1.split("Data>", 1)[0]
        xml_file = "<Data" + xml_part2 + "Data>"

I would be very grateful for any help.

Solution

You can use a try/except block to catch the IndexError raised when "<Data" is not in the file. In that case split() returns a list with a single element, so indexing it with [1] fails.

import glob

path = r"C:\Users\Nathan\Desktop\Test\*.xml"

for xml in glob.glob(path):
    with open(xml, 'r') as data_file:
        file_content = data_file.read()

        try:
            xml_part1 = file_content.split("<Data", 1)[1]
            xml_part2 = xml_part1.split("Data>", 1)[0]
            xml_data = "<Data" + xml_part2 + "Data>"
            
            # Do stuff with your xml data
            
        except IndexError:
            print(f"Skipping file {xml} as it does not contain <Data>")
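
As an alternative, you could check for the markers explicitly instead of relying on the exception. The following is only a sketch: the helper name extract_data_block and its parameters are made up for illustration and are not part of your code. It uses str.find(), which returns -1 when the substring is missing, and it also skips files that have an opening "<Data" but no closing "Data>".

```python
import glob


def extract_data_block(text, open_tag="<Data", close_tag="Data>"):
    """Return the text from the first open_tag through the next close_tag,
    or None if either marker is missing (hypothetical helper)."""
    start = text.find(open_tag)
    if start == -1:
        return None  # no opening marker in this text
    # Search for the closing marker only after the opening one.
    end = text.find(close_tag, start + len(open_tag))
    if end == -1:
        return None  # opening marker without a closing marker
    return text[start:end + len(close_tag)]


path = r"C:\Users\Nathan\Desktop\Test\*.xml"

for xml in glob.glob(path):
    with open(xml, "r") as data_file:
        xml_data = extract_data_block(data_file.read())

    if xml_data is None:
        print(f"Skipping file {xml} as it does not contain a <Data ... Data> block")
        continue

    # Do stuff with your xml data
```

One reason to prefer this shape is that the processing code sits outside any try block, so an unrelated IndexError raised while working with the data is not mistaken for a missing "<Data".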