The simple code down below prints certain elements and their attributes in a dataframe. It iterates through an XML files, looks for these elements and just prints them out
Code
import xml.etree.ElementTree as ET
import pandas as pd
tree = ET.parse('1last.xml')
root = tree.getroot()
for neighbor in root.iter('Description'):
print(neighbor.attrib, neighbor.text)
for neighbor in root.iter('SetData'):
print(neighbor.attrib)
for neighbor in root.iter('FileX'):
print(neighbor.attrib)
for neighbor in root.iter('FileY'):
print(neighbor.attrib)
Output
I want to export the output into a Excel table form but It doesn’t seem to work I have tried this
export_excel = root.to_excel (r'C:\Users\fsdf.LAPTOP-E8A1PPIN\Desktop\test\export_dataframe.xlsx', index = None, header=True)
but I got the error saying “AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'to_excel'
This my xml file
<?xml version="1.0" encoding="utf-8"?>
<ProjectData>
<FINAL>
<START id="ID0001" service_code="0x5196">
<Docs Docs_type="START">
<Rational>225196</Rational>
<Qualify>6251960000A0DE</Qualify>
</Docs>
<Description num="1213f2312">The parameter</Description>
<SetFile dg="" dg_id="">
<SetData value="32" />
</SetFile>
</START>
<START id="DG0003" service_code="0x517B">
<Docs Docs_type="START">
<Rational>23423</Rational>
<Qualify>342342</Qualify>
</Docs>
<Description num="3423423f3423">The third</Description>
<SetFile dg="" dg_id="">
<FileX dg="" axis_pts="2" name="" num="" dg_id="" />
<FileY unit="" axis_pts="20" name="TOOLS" text_id="23423" unit_id="" />
<SetData x="E1" value="21259" />
<SetData x="E2" value="0" />
</SetFile>
</START>
<START id="ID0048" service_code="0x5198">
<RawData rawdata_type="OPDATA">
<Request>225198</Request>
<Response>343243324234234</Response>
</RawData>
<Meaning text_id="434234234">The forth</Meaning>
<ValueDataset unit="m" unit_id="FEDS">
<FileX dg="kg" discrete="false" axis_pts="19" name="weight" text_id="SDF3" unit_id="SDGFDS" />
<SetData xin="sdf" xax="233" value="323" />
<SetData xin="123" xax="213" value="232" />
<SetData xin="2321" xax="232" value="23" />
</ValueDataset>
</START>
</FINAL>
</ProjectData>
One approach would be to use a library such as openpyxl
to write the Excel file directly. The following shows how this could be done:
import openpyxl
from bs4 import BeautifulSoup
with open('1last.xml') as f_input:
soup = BeautifulSoup(f_input, 'lxml')
wb = openpyxl.Workbook()
ws = wb.active
ws.title = "Sheet1"
ws.append(["Description", "num", "text"])
for description in soup.find_all("description"):
ws.append(["", description['num'], description.text])
ws.append(["SetData", "x", "value", "xin", "xax"])
for setdata in soup.find_all("setdata"):
ws.append(["", setdata.get('x', ''), setdata.get('value', ''), setdata.get('xin', ''), setdata.get('xax', '')])
wb.save(filename="1last.xlsx")
This would create an Excel file looking like: