python loops operating-system elementtree xml convert

How to apply code to all the files in a directory and convert XML files to txt files


I'm trying to use Python to extract the necessary information from each XML file in a directory and write it to a txt file. My code runs without errors, but the result is only one file. The code also doesn't look very elegant, so I would be very glad if anyone could help me solve the problem and suggest corrections.

I'm afraid I'm a complete beginner, so I may have made many fundamental mistakes...

Here is my code:


from xml.etree import ElementTree as ET
import os

path = r'/Users/mo/Documents/annotations/xmls'
filenames = []

for filename in os.listdir(path):
    if not filename.endswith('.xml'):
        continue
    fullname = os.path.join(path,filename)
    filenames.append(fullname)

each_name = filename[:-4]

with open(each_name + '.txt', 'a') as f:

    for filename in filenames:
        tree = ET.parse(filename)
        root = tree.getroot()

        for object in root.findall('object'):
            categoryID = object.find('name').text

            for bnd_box in object.findall('bndbox'):
                Xcenter = (int(bnd_box.find('xmax').text) - int(bnd_box.find('xmin').text))/2
                Ycenter = (int(bnd_box.find('ymax').text) - int(bnd_box.find('ymin').text))/2
                width = int(bnd_box.find('xmax').text)- int(bnd_box.find('xmin').text)
                height = int(bnd_box.find('ymax').text) - int(bnd_box.find('ymin').text)
                Detection_rows = str(categoryID) + str(Xcenter) + str(Ycenter) + str(width) + str(height) + '\n'

    f.write(str(Detection_rows))

Thank you very much


Solution

  • Try this (I did not test it):

    from xml.etree import ElementTree as ET
    import os
    
    path = r'/Users/mo/Documents/annotations/xmls'
    
    for filename in os.listdir(path):
        if not filename.endswith('.xml'):
            continue
            
        fullname = os.path.join(path, filename)
    
        with open(fullname[:-4] + '.txt', 'a') as f:
    
            tree = ET.parse(fullname)
            root = tree.getroot()
    
            for object in root.findall('object'):
                categoryID = object.find('name').text
    
                for bnd_box in object.findall('bndbox'):
                    Xcenter = (int(bnd_box.find('xmax').text) - int(bnd_box.find('xmin').text))/2
                    Ycenter = (int(bnd_box.find('ymax').text) - int(bnd_box.find('ymin').text))/2
                    width = int(bnd_box.find('xmax').text)- int(bnd_box.find('xmin').text)
                    height = int(bnd_box.find('ymax').text) - int(bnd_box.find('ymin').text)
                        Detection_rows = str(categoryID) + str(Xcenter) + str(Ycenter) + str(width) + str(height) + '\n'
                        # Write inside the loop so every bounding box gets a row, not only the last one
                        f.write(Detection_rows)
    

    Your for loop should be outside the open call, and you mixed up filenames and filename. I also removed one unnecessary for loop. Let me know if it solves the issue for you.
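
If it helps, here is a slightly more defensive sketch of the same idea: one txt file per XML file, one row per bounding box. A few things in it are my own assumptions rather than anything from your post: the helper names detection_rows and convert_directory are made up, the values are separated by single spaces, the output file is opened with 'w' so that rerunning does not append duplicate rows, and the centre is computed as the midpoint (xmin + xmax) / 2 instead of (xmax - xmin) / 2, which is half the width. Adjust these if your target format expects something else.

```python
from pathlib import Path
from xml.etree import ElementTree as ET


def detection_rows(root):
    """Return one space-separated text row per <bndbox> found under root."""
    rows = []
    for obj in root.findall('object'):
        category_id = obj.find('name').text
        for box in obj.findall('bndbox'):
            xmin = int(box.find('xmin').text)
            ymin = int(box.find('ymin').text)
            xmax = int(box.find('xmax').text)
            ymax = int(box.find('ymax').text)
            # Assumption: the centre is the midpoint of the box edges.
            x_center = (xmin + xmax) / 2
            y_center = (ymin + ymax) / 2
            width = xmax - xmin
            height = ymax - ymin
            rows.append(f'{category_id} {x_center} {y_center} {width} {height}')
    return rows


def convert_directory(path):
    """Write a .txt file next to every .xml file found in path."""
    for xml_file in sorted(Path(path).glob('*.xml')):
        rows = detection_rows(ET.parse(xml_file).getroot())
        # Assumption: 'w' overwrites on rerun; the code above uses 'a' (append).
        with open(xml_file.with_suffix('.txt'), 'w') as f:
            for row in rows:
                f.write(row + '\n')
```

You would call it as convert_directory(r'/Users/mo/Documents/annotations/xmls'). Collecting the rows in a list first also means an XML file without any object elements simply produces an empty txt file instead of raising a NameError on an undefined variable.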