I am trying to parse an XML file into a TXT file. This is what my XML file looks like:
<annotation>
  <folder>training</folder>
  <filename>106310488.jpg</filename>
  <source>
    <database>synthetic initialization</database>
    <annotation>PASCAL VOC2007</annotation>
    <image>synthetic</image>
    <flickrid>none</flickrid>
  </source>
  <owner>
    <flickrid>none</flickrid>
    <name>none</name>
  </owner>
  <size>
    <width>1024</width>
    <height>681</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>shell</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>234</xmin>
      <ymin>293</ymin>
      <xmax>281</xmax>
      <ymax>340</ymax>
    </bndbox>
  </object>
  <object>
    <name>shell</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>504</xmin>
      <ymin>302</ymin>
      <xmax>551</xmax>
      <ymax>349</ymax>
    </bndbox>
  </object>
  <object>
    <name>shell</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>776</xmin>
      <ymin>302</ymin>
      <xmax>823</xmax>
      <ymax>349</ymax>
    </bndbox>
  </object>
</annotation>
The information I am interested in is inside the <object> elements. I want to get the <name> and everything inside <bndbox>: these are the names and bounding box coordinates of objects in a dataset. I don't know how many <object> entries with a <bndbox> there are in each XML file, so I want to write logic that gets all of them.
So far, my logic gets and processes only the first occurrence of <object><bndbox></bndbox></object>. If there are any other bounding boxes inside the XML file, my code simply skips them. I don't want this. Here is my code:
import xml.etree.ElementTree as ET
from time import time

import cv2

for annotations_file in annotations_dir:
    annotations = []
    milliseconds = int(time() * 1000)
    doc = ET.parse('/content/darknet/logorec/openlogo/Annotations/' + annotations_file)  # Parsing the XML file
    new_annotations_file_name = annotations_file.split('.')[0]  # Getting the name of the XML file without the file extension
    canvas = cv2.imread('/content/darknet/logorec/openlogo/JPEGImages/' + new_annotations_file_name + '.jpg')  # Get the entire image
    canvas_shape = canvas.shape  # Get the dimensions of the image
    root = doc.getroot()  # Gets the root of the XML file
    annotations_box = root[6][4]  # Gets the bounding box coordinates from the XML file
    class_name = root[6][0]  # Name of the object within the bounding box
    class_name = class_name.text  # Getting the text value
    for ant in annotations_box:
        annotations.append(ant.text)  # Appending every single bounding box coordinate to an empty list

    ''' These are my annotations calculations for the YOLO model'''
    logo_shape_w = int(annotations[2]) - int(annotations[0])
    logo_shape_h = int(annotations[3]) - int(annotations[1])
    x1 = int(annotations[0])  # x1 = xmin
    y1 = int(annotations[3])  # y1 = ymax
    x2 = x1 + logo_shape_w
    y2 = y1 + logo_shape_h
    w = x2 - x1
    h = y2 - y1
    center_x = x1 + (w/2)
    center_y = y1 + (h/2)
    x = center_x / canvas_shape[0]
    y = center_y / canvas_shape[1]
    width = w / canvas_shape[0]
    height = h / canvas_shape[1]
    '''---------------------------------------------------------'''
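For comparison, here is a minimal sketch of the box conversion on its own, assuming the target is YOLO's normalized (x_center, y_center, width, height) format. Two details differ from the calculation above: the box origin is taken from ymin rather than ymax, and x values are divided by the image width rather than by canvas_shape[0] (cv2.imread returns an array shaped (height, width, channels), so index 0 is the height). The function name voc_to_yolo and its parameters are made up for illustration.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a PASCAL VOC corner box to normalized YOLO center format.

    Returns (x_center, y_center, width, height), each divided by the
    matching image dimension.
    """
    w = xmax - xmin
    h = ymax - ymin
    x_center = xmin + w / 2  # horizontal midpoint of the box
    y_center = ymin + h / 2  # vertical midpoint, measured from the top edge
    return (x_center / img_w, y_center / img_h, w / img_w, h / img_h)


# First <object> of the sample file; image size taken from its <size> element.
print(voc_to_yolo(234, 293, 281, 340, img_w=1024, img_h=681))
```

With an OpenCV image, img_h and img_w would be canvas.shape[0] and canvas.shape[1] respectively.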
Parsing the XML with XPath makes it possible to iterate over the items in objList. Only the first item is shown here:
>>> from lxml import etree
>>> tree = etree.parse('test.xml')
>>> objList = tree.xpath('//object')
>>> bnd = objList[0].xpath('name | bndbox/*')
>>> for e in bnd:
...     e.text
...
'shell'
'234'
'293'
'281'
'340'
Iterating over all objects:
>>> for obj in objList:
...     bnd = obj.xpath('name | bndbox/*')
...     for e in bnd:
...         e.text
...
'shell'
'234'
'293'
'281'
'340'
'shell'
'504'
'302'
'551'
'349'
'shell'
'776'
'302'
'823'
'349'
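The same iteration can also be done with the standard library's xml.etree.ElementTree, which the question already uses. This is only a sketch: the helper name extract_boxes is made up for illustration, and it assumes every <bndbox> contains integer xmin, ymin, xmax and ymax children. The inline sample is a trimmed copy of two objects from the question's file.

```python
import xml.etree.ElementTree as ET


def extract_boxes(root):
    """Return (name, xmin, ymin, xmax, ymax) for every <object> under root."""
    boxes = []
    for obj in root.iter('object'):  # visits every <object>, not just the first
        bndbox = obj.find('bndbox')
        if bndbox is None:  # skip objects that carry no bounding box
            continue
        coords = [int(bndbox.findtext(tag))
                  for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        boxes.append((obj.findtext('name'), *coords))
    return boxes


# Trimmed-down sample containing two of the objects from the question.
sample = """
<annotation>
  <object>
    <name>shell</name>
    <bndbox><xmin>234</xmin><ymin>293</ymin><xmax>281</xmax><ymax>340</ymax></bndbox>
  </object>
  <object>
    <name>shell</name>
    <bndbox><xmin>504</xmin><ymin>302</ymin><xmax>551</xmax><ymax>349</ymax></bndbox>
  </object>
</annotation>
"""

root = ET.fromstring(sample)
for box in extract_boxes(root):
    print(*box)
# prints:
# shell 234 293 281 340
# shell 504 302 551 349
```

For a file on disk, ET.parse(path).getroot() gives the root to pass in, and each returned tuple can be written as one line of the TXT file.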