Search code examples
pythondataframeexport-to-csvopenpyxlxlsx

Export Dataframe to Excel Table


The simple code down below prints certain elements and their attributes in a dataframe. It iterates through an XML files, looks for these elements and just prints them out

Code

import xml.etree.ElementTree as ET
import pandas as pd
tree = ET.parse('1last.xml')
root = tree.getroot()

for neighbor in root.iter('Description'):
    print(neighbor.attrib, neighbor.text)
for neighbor in root.iter('SetData'):
    print(neighbor.attrib)
for neighbor in root.iter('FileX'):
    print(neighbor.attrib) 
for neighbor in root.iter('FileY'):
    print(neighbor.attrib)

Output

Here

I want to export the output into a Excel table form but It doesn’t seem to work I have tried this

export_excel = root.to_excel (r'C:\Users\fsdf.LAPTOP-E8A1PPIN\Desktop\test\export_dataframe.xlsx', index = None, header=True)

but I got the error saying “AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'to_excel'

This my xml file

<?xml version="1.0" encoding="utf-8"?>
<ProjectData>
<FINAL>
    <START id="ID0001" service_code="0x5196">
      <Docs Docs_type="START">
        <Rational>225196</Rational>
        <Qualify>6251960000A0DE</Qualify>
      </Docs>
      <Description num="1213f2312">The parameter</Description>
      <SetFile dg="" dg_id="">
        <SetData value="32" />
      </SetFile>
    </START>
    <START id="DG0003" service_code="0x517B">
      <Docs Docs_type="START">
        <Rational>23423</Rational>
        <Qualify>342342</Qualify>
      </Docs>
      <Description num="3423423f3423">The third</Description>
      <SetFile dg="" dg_id="">
        <FileX dg="" axis_pts="2" name="" num="" dg_id="" />
        <FileY unit="" axis_pts="20" name="TOOLS" text_id="23423" unit_id="" />
        <SetData x="E1" value="21259" />
        <SetData x="E2" value="0" />
      </SetFile>
    </START>
    <START id="ID0048" service_code="0x5198">
      <RawData rawdata_type="OPDATA">
        <Request>225198</Request>
        <Response>343243324234234</Response>
      </RawData>
      <Meaning text_id="434234234">The forth</Meaning>
      <ValueDataset unit="m" unit_id="FEDS">
        <FileX dg="kg" discrete="false" axis_pts="19" name="weight" text_id="SDF3" unit_id="SDGFDS" />
        <SetData xin="sdf" xax="233" value="323" />
        <SetData xin="123" xax="213" value="232" />
        <SetData xin="2321" xax="232" value="23" />
      </ValueDataset>
    </START>
</FINAL>
</ProjectData>

This is what I would want the table to look like. enter image description here


Solution

  • One approach would be to use a library such as openpyxl to write the Excel file directly. The following shows how this could be done:

    import openpyxl    
    from bs4 import BeautifulSoup
    
    
    with open('1last.xml') as f_input:
        soup = BeautifulSoup(f_input, 'lxml')
    
    wb = openpyxl.Workbook()
    ws = wb.active
    ws.title = "Sheet1"
    
    ws.append(["Description", "num", "text"])
    
    for description in soup.find_all("description"):
        ws.append(["", description['num'], description.text])
    
    ws.append(["SetData", "x", "value", "xin", "xax"])
    
    for setdata in soup.find_all("setdata"):
        ws.append(["", setdata.get('x', ''), setdata.get('value', ''), setdata.get('xin', ''), setdata.get('xax', '')])
    
    wb.save(filename="1last.xlsx")
    

    This would create an Excel file looking like:

    Excel screenshot