python xml xls

xml file from excel sheet data


Hello. Using the code snippet below I created the XML shown under Output, but I notice two problems. First, the attribute order in the output differs from the order I use in the code, i.e. <node y="-3749099.0" x="-45194.0" id="11542.0"/> should be <node id="11542.0" x="-45194.0" y="-3749099.0"/>. Second, each row ends up in its own <network> document instead of all nodes sitting under a single <nodes> element, as in the desired output below. Can someone advise how I can:

  • correct my code to get the desired output
  • extend the code so that, for an Excel file with many columns (more than 3), I won't have to hard-code val[0], val[1], val[2] as in FIELD(id=str(val[0]), x=str(val[1]), y=str(val[2]))

Code Snippet:

import lxml.etree
import lxml.builder
import xlrd

wb = xlrd.open_workbook("emme_nodes1.xls")
sh = wb.sheet_by_index(0)
tags = [n.replace(" ", "").lower() for n in sh.row_values(0)]

for row in range(1, sh.nrows):
    val = sh.row_values(row)

    E = lxml.builder.ElementMaker()
    ROOT = E.network
    DOC = E.nodes
    FIELD = E.node
    my_doc = ROOT(
            DOC(
                FIELD(id=str(val[0]), x=str(val[1]), y=str(val[2])),
                )
            )
    print lxml.etree.tostring(my_doc, pretty_print=True)

Output:

<network>
  <nodes>
    <node y="-3748681.0" x="-45333.0" id="11543.0"/>
  </nodes>
</network>

<network>
  <nodes>
    <node y="-3747847.0" x="-44369.0" id="11540.0"/>
  </nodes>
</network>

<network>
  <nodes>
    <node y="-3748683.0" x="-45060.0" id="11541.0"/>
  </nodes>
</network>

<network>
  <nodes>
    <node y="-3750248.0" x="-45518.0" id="11546.0"/>
  </nodes>
</network>

<network>
  <nodes>
    <node y="-3750024.0" x="-45448.0" id="11547.0"/>
  </nodes>
</network>

<network>
  <nodes>
    <node y="-3749821.0" x="-44745.0" id="11544.0"/>
  </nodes>
</network>

<network>
  <nodes>
    <node y="-3750508.0" x="-45561.0" id="11545.0"/>
  </nodes>
</network>

<network>
  <nodes>
    <node y="-3750202.0" x="-45802.0" id="11548.0"/>
  </nodes>
</network>

<network>
  <nodes>
    <node y="-3749805.0" x="-45485.0" id="11549.0"/>
  </nodes>
</network>

Desired Output:

<network>
  <nodes>
    <node id="11542.0" x="-45194.0" y="-3749099.0"/>
    <node id="11543.0" x="-45333.0" y="-3748681.0"/>
    <node id="11540.0" x="-44369.0" y="-3747847.0"/>
    <node id="11541.0" x="-45060.0" y="-3748683.0"/>
    <node id="11546.0" x="-45518.0" y="-3750248.0"/>
    <node id="11547.0" x="-45448.0" y="-3750024.0"/>
    <node id="11544.0" x="-44745.0" y="-3749821.0"/>
    <node id="11545.0" x="-45561.0" y="-3750508.0"/>
    <node id="11548.0" x="-45802.0" y="-3750202.0"/>
    <node id="11549.0" x="-45485.0" y="-3749805.0"/>
  </nodes>
</network>

Excel sheet (emme_nodes1.xls):

 id         x           y
11542   -45194.0    -3749099.0
11543   -45333.0    -3748681.0
11540   -44369.0    -3747847.0
11541   -45060.0    -3748683.0
11546   -45518.0    -3750248.0
11547   -45448.0    -3750024.0
11544   -44745.0    -3749821.0
11545   -45561.0    -3750508.0
11548   -45802.0    -3750202.0
11549   -45485.0    -3749805.0

Solution

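  • I finally arrived at this solution, and it works. It creates the <network> and <nodes> elements once, outside the loop, and adds one <node> per row. Note that it also sets a name attribute on <network> and writes id as an integer, both of which differ slightly from the desired output above.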

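    import xlrd
    from lxml import etree

    # Build the <network name="Network"><nodes/></network> skeleton once, outside the loop
    root = etree.Element('network')
    root.set('name', 'Network')
    nodes = etree.SubElement(root, 'nodes')

    wb = xlrd.open_workbook("emme_nodes1.xls")
    sh = wb.sheet_by_index(0)

    # Row 0 holds the column headers, so the data starts at row 1
    for row in range(1, sh.nrows):
        val = sh.row_values(row)
        element = etree.SubElement(nodes, 'node')
        # xlrd returns numeric cells as floats; int() drops the ".0" from the id
        element.set('id', str(int(val[0])))
        element.set('x', str(val[1]))
        element.set('y', str(val[2]))

    # tostring() returns bytes on Python 3, so decode before printing
    print(etree.tostring(root, pretty_print=True).decode('utf-8'))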
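
    The second part of the question (not hard-coding val[0], val[1], val[2]) is not covered by the code above. A minimal sketch of one way to do it follows, pairing each header name with the cell in the same column. Several things here are my assumptions rather than part of the original code: the function name build_network and its int_columns parameter are made up for illustration, the sample rows are copied from the sheet shown in the question instead of being read with xlrd, and I use the standard library's xml.etree.ElementTree instead of lxml so the snippet runs without extra installs (Element, SubElement and set are called the same way in both). It also leaves out the name attribute on <network> so the result matches the desired output.

```python
import xml.etree.ElementTree as ET


def build_network(header, rows, int_columns=("id",)):
    """Build <network><nodes><node .../>...</nodes></network> from tabular data.

    header      -- list of column names (first row of the sheet)
    rows        -- iterable of rows, each a list of cell values
    int_columns -- hypothetical option: columns whose values are written as integers
    """
    # Same header clean-up as in the question: strip spaces and lower-case
    tags = [str(name).replace(" ", "").lower() for name in header]

    root = ET.Element("network")
    nodes = ET.SubElement(root, "nodes")

    for row in rows:
        node = ET.SubElement(nodes, "node")
        # One attribute per column, however many columns there are
        for tag, value in zip(tags, row):
            if tag in int_columns:
                value = int(value)
            node.set(tag, str(value))
    return root


# Sample data copied from the sheet in the question (numeric cells arrive as floats)
header = ["id", "x", "y"]
rows = [
    [11542.0, -45194.0, -3749099.0],
    [11543.0, -45333.0, -3748681.0],
]

root = build_network(header, rows)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

    With xlrd, header would be sh.row_values(0) and rows would be the sh.row_values(row) lists for row in range(1, sh.nrows), as in the loop above. Attributes are set one at a time in column order, which avoids passing them as keyword arguments the way the question's FIELD(...) call does.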