I am trying to parse an XML file into a TXT file. This is what my XML file looks like:
<database>synthetic initialization</database>
<annotation>PASCAL VOC2007</annotation>
The information that I am interested in is within <object>. I want to get the <name> and everything inside <bndbox>. These are the names and bounding box coordinates of objects in a dataset. I don't know how many <object> entries with a <bndbox> there are in each XML file, so I want to write logic that gets all of them.
So far, my logic gets and processes only the first occurrence of <object><bndbox></bndbox></object>. If there are any other bounding boxes in the XML file, my code simply skips them. I don't want this. Here is my code:
# Imports are assumed here; the original snippet does not show them
from time import time
import xml.etree.ElementTree as ET
import cv2

for annotations_file in annotations_dir:
    annotations = []
    milliseconds = int(time() * 1000)
    doc = ET.parse('/content/darknet/logorec/openlogo/Annotations/' + annotations_file)  # Parsing the XML file
    new_annotations_file_name = annotations_file.split('.')[0]  # Getting the name of the XML file without the file extension
    canvas = cv2.imread('/content/darknet/logorec/openlogo/JPEGImages/' + new_annotations_file_name + '.jpg')  # Get the entire image
    canvas_shape = canvas.shape  # Get the dimensions of the image
    root = doc.getroot()  # Gets the root of the XML file
    annotations_box = root[6][4]  # Gets the bounding box coordinates from the XML file
    class_name = root[6][0]  # Name of the object within the bounding box
    class_name = class_name.text  # Getting the text value
    for ant in annotations_box:
        annotations.append(ant.text)  # Appending every single bounding box coordinate to an empty list
    # These are my annotation calculations for the YOLO model
    logo_shape_w = int(annotations[2]) - int(annotations[0])
    logo_shape_h = int(annotations[3]) - int(annotations[1])
    x1 = int(annotations[0])  # x1 = xmin
    y1 = int(annotations[3])  # y1 = ymax
    x2 = x1 + logo_shape_w
    y2 = y1 + logo_shape_h
    w = x2 - x1
    h = y2 - y1
    center_x = x1 + (w/2)
    center_y = y1 + (h/2)
    x = center_x / canvas_shape[0]
    y = center_y / canvas_shape[1]
    width = w / canvas_shape[0]
    height = h / canvas_shape[1]
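For reference, the box arithmetic above can be sketched as a small helper. This is only a minimal sketch, assuming the conventional YOLO label format (box centre and size, each divided by the image width or height) and assuming the image size is passed in explicitly; note that an array returned by cv2.imread has shape (height, width, channels), so the width is shape[1] and the height is shape[0]. The function name voc_to_yolo is hypothetical and not part of any library.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a Pascal VOC corner box to normalized YOLO (cx, cy, w, h).

    Hypothetical helper: assumes pixel coordinates with the origin at the
    top-left corner, and img_w / img_h given in pixels.
    """
    w = xmax - xmin              # box width in pixels
    h = ymax - ymin              # box height in pixels
    center_x = xmin + w / 2      # box centre, measured from the left edge
    center_y = ymin + h / 2      # box centre, measured from the top edge
    # Normalize x-related values by the image width, y-related by the height
    return center_x / img_w, center_y / img_h, w / img_w, h / img_h


if __name__ == "__main__":
    # With an OpenCV image: img_h, img_w = canvas.shape[:2]
    print(voc_to_yolo(10, 20, 110, 70, img_w=200, img_h=100))
```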
Parsing the XML with XPath makes it possible to iterate over every <object> item in objList. Showing only the first item:
>>> from lxml import etree
>>> tree = etree.parse('test.xml')
>>> objList = tree.xpath('//object')
>>> bnd = objList[0].xpath('name | bndbox/*')
>>> for e in bnd:
...     e.text
Iterating over all objects:
>>> for obj in objList:
...     bnd = obj.xpath('name | bndbox/*')
...     for e in bnd:
...         e.text
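The same loop can probably be written with the standard-library xml.etree.ElementTree module that the question already uses, instead of lxml. The following is a minimal sketch, not a drop-in replacement: the inline SAMPLE document, its logo_a / logo_b names, and the read_objects helper are all made up for illustration, and it assumes each <bndbox> contains xmin, ymin, xmax and ymax children as in the Pascal VOC layout.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the Pascal VOC layout, with two <object> entries
SAMPLE = """<annotation>
  <object>
    <name>logo_a</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>70</ymax></bndbox>
  </object>
  <object>
    <name>logo_b</name>
    <bndbox><xmin>5</xmin><ymin>5</ymin><xmax>50</xmax><ymax>40</ymax></bndbox>
  </object>
</annotation>"""


def read_objects(root):
    """Return one (name, xmin, ymin, xmax, ymax) tuple per <object> element."""
    boxes = []
    for obj in root.iter('object'):      # visits every <object>, not just the first
        name = obj.findtext('name')
        bndbox = obj.find('bndbox')
        if bndbox is None:               # skip objects without a bounding box
            continue
        # Look coordinates up by tag name rather than by position
        coords = {child.tag: int(float(child.text)) for child in bndbox}
        boxes.append((name, coords['xmin'], coords['ymin'],
                      coords['xmax'], coords['ymax']))
    return boxes


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)         # for a file: ET.parse(path).getroot()
    for row in read_objects(root):
        print(row)
```

Looking elements up by tag name instead of by index (root[6][4]) means the code does not depend on how many elements come before each <object>, which is why it picks up every bounding box in the file.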