python excel xml

How to convert an XML file to an Excel file?


I have a directory that contains multiple XML files. Let's say it contains the following two:

<Record>
        <RecordID>Madird01</RecordID>
        <Location>Madird</Location>
        <Date>07-09-2020</Date>
        <Time>07u43m55s</Time>
        <Version>2.0.1</Version>
        <Version_2>v1.9</Version_2>
    <Max_30e>
        <I_25Hz_1s>56.40</I_25Hz_1s>
        <I_25Hz_2s>7.44</I_25Hz_2s>
    </Max_30e>
    <Max_30e>
        <I_75Hz_1s>1.56</I_75Hz_1s>
        <I_75Hz_2s>0.36</I_75Hz_2s>
    </Max_30e>
</Record>

And:

<Record>
        <RecordID>London01</RecordID>
        <Location>London</Location>
        <Date>07-09-2020</Date>
        <Time>08u53m45s</Time>
        <Version>2.0.1</Version>
        <Version_2>v1.9</Version_2>
    <Max_30e>
        <I_25Hz_1s>56.40</I_25Hz_1s>
        <I_25Hz_2s>7.44</I_25Hz_2s>
    </Max_30e>
    <Max_30e>
        <I_75Hz_1s>1.56</I_75Hz_1s>
        <I_75Hz_2s>0.36</I_75Hz_2s>
    </Max_30e>
</Record>

Now I want to convert these into an Excel file that shows each XML file in horizontal order, one file per row.

I tried converting the XML to a CSV string first and then to Excel, but I got stuck. There should be an easier way.

This is my current code:

import xml.etree.ElementTree as ET
import os

xml_root = r'c:\data\Desktop\Blue\XML-files'

for file in os.listdir(xml_root):
    xml_file_path = os.path.join(xml_root, file)
    
    tree = ET.parse(xml_file_path)
    root = tree.getroot()
    tree = ET.ElementTree(root)

    for child in root:
        mainlevel = child.tag
        xmltocsv = ''
        for elem in root.iter():
            if elem.tag == root.tag:
                continue
            if elem.tag == mainlevel:
                xmltocsv = xmltocsv + '\n'
            xmltocsv = xmltocsv + str(elem.tag).rstrip() + str(elem.attrib).strip() + ';' + str(elem.text).rstrip() + ';'

Solution

  • Create a CSV file, which is an Excel-friendly format.

    import xml.etree.ElementTree as ET
    from os import listdir


    # Assumes the XML files are in the current directory and their names start with 'xml'
    xml_lst = [name for name in listdir() if name.startswith('xml')]
    fields = ['RecordID','I_25Hz_1s','I_75Hz_2s'] # TODO - add rest of the fields
    with open('out.csv','w') as out:
        # Header row
        out.write(','.join(fields) + '\n')
        for xml in xml_lst:
            root = ET.parse(xml)
            # './/<tag>' finds the first element with that tag at any depth
            values = [root.find(f'.//{field}').text for field in fields]
            out.write(','.join(values) + '\n')
    

    output

    RecordID,I_25Hz_1s,I_75Hz_2s
    Madird01,56.40,0.36
    London01,56.40,0.36
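
The `fields` list above has to be filled in by hand. As a possible extension, here is a minimal sketch that collects every leaf element automatically and writes the rows with the standard `csv` module. The helper names `flatten_record` and `xml_dir_to_csv` are made up for illustration. It assumes the files end in `.xml` and that each leaf tag appears only once per file, as in the samples; if a tag repeats, the last value wins.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path


def flatten_record(xml_path):
    """Return {tag: text} for every leaf element (no children) of one XML file."""
    root = ET.parse(xml_path).getroot()
    row = {}
    for elem in root.iter():
        if len(elem) == 0:  # leaf element: holds a value, has no child elements
            row[elem.tag] = (elem.text or '').strip()
    return row


def xml_dir_to_csv(xml_dir, out_path):
    """Write one CSV row per *.xml file in xml_dir and return the column names."""
    rows = [flatten_record(p) for p in sorted(Path(xml_dir).glob('*.xml'))]
    # Column order follows the first appearance of each tag across all files.
    columns = []
    for row in rows:
        for tag in row:
            if tag not in columns:
                columns.append(tag)
    # newline='' lets the csv module control the line endings.
    with open(out_path, 'w', newline='') as out:
        writer = csv.DictWriter(out, fieldnames=columns, restval='')
        writer.writeheader()
        writer.writerows(rows)
    return columns
```

With the directory from the question, this could be called as `xml_dir_to_csv(r'c:\data\Desktop\Blue\XML-files', 'out.csv')`. A file that lacks one of the tags gets an empty cell in that column. If Excel in your locale expects semicolons, passing `delimiter=';'` to `csv.DictWriter` should help.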