I am analyzing the xml data of several files. To get my data, I first need to split the xml data from the whole file to be able to work with it.
For this I use the split() method and search for <Data.
Here I run into a problem: some of the files contain no XML data, and I would like to simply skip those files.
import glob

path = r"C:\Users\Nathan\Desktop\Test\*.xml"
for xml in glob.glob(path):
    with open(xml) as data_file:
        file_content = data_file.read()
    xml_part1 = file_content.split("<Data", 1)[1]  # here I get an error if "<Data" is not in the file
    xml_part2 = xml_part1.split("Data>", 1)[0]
    xml_file = "<Data" + xml_part2 + "Data>"
I would be very grateful for any help.
You can use a try/except block to catch the IndexError that is raised when a file contains no "<Data". In that case split() returns a list with a single element, so indexing it with [1] fails.
import glob

path = r"C:\Users\Nathan\Desktop\Test\*.xml"
for xml in glob.glob(path):
    with open(xml, 'r') as data_file:
        file_content = data_file.read()
    try:
        xml_part1 = file_content.split("<Data", 1)[1]
        xml_part2 = xml_part1.split("Data>", 1)[0]
        xml_data = "<Data" + xml_part2 + "Data>"
        # Do stuff with your xml data
    except IndexError:
        print(f"Skipping file {xml} as it does not contain <Data>")
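If you would rather not rely on an exception, another option is to look for the markers first and only slice when both are present. The following is just a sketch of that idea. The helper name extract_data_block is made up for illustration, and it assumes the block you want runs from the first "<Data" to the next "Data>" after it, which is the same thing the split() version builds.

```python
from typing import Optional


def extract_data_block(text: str) -> Optional[str]:
    """Return the text from the first "<Data" through the next "Data>".

    Returns None when either marker is missing, so the caller can skip the file.
    (Hypothetical helper, not part of any library.)
    """
    start = text.find("<Data")
    if start == -1:
        return None  # no opening marker in this file
    # Search for the closing marker only after the opening one.
    end = text.find("Data>", start + len("<Data"))
    if end == -1:
        return None  # opening marker found, but nothing closes it
    return text[start:end + len("Data>")]
```

In the loop you would then call xml_data = extract_data_block(file_content) and continue to the next file when the result is None. Unlike the try/except version, this also skips files that contain "<Data" but no closing "Data>".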