python, xml, tree, elementtree, graphml

Etree returns a "random" string instead of attribute name


I am new to Python and to tree structures in general, and I have run into a problem.

I have a dataset structured like this:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <node id="someNode">
    <data key="label">someNode</data>
  </node>
</graphml>

I want to read the attribute names and attribute values of both the root and the node elements.

I have tried using Python's xml.etree.ElementTree like this:

import xml.etree.ElementTree as etree

tree = etree.parse('myDataset')
root = tree.getroot()

print('Root: ', root)

# getchildren() was removed in Python 3.9; list(root) returns the same children
print('Children: ', list(root))

but this is what I get:

Root:  <Element '{http://graphml.graphdrawing.org/xmlns}graphml' at 0x031DB5A0>
Children:  [<Element '{http://graphml.graphdrawing.org/xmlns}key' at 0x03F9BFC0>

I also tried .text and .tag, which only removed the "at 0x03..." part.

I hope the question is understandable and that someone knows a solution.


Solution

  • If you want to output your root and child nodes as XML text instead of the default representation, use xml.etree.ElementTree.tostring(root) for the root and, for each child:

    import xml.etree.ElementTree

    for child in root:
        print(xml.etree.ElementTree.tostring(child))
    

    Official documentation: https://docs.python.org/2/library/xml.etree.elementtree.html#xml.etree.ElementTree.tostring

    And if you want the tag name, use the tag attribute of each element:

    print(root.tag)
    for child in root:
        print(child.tag)
    

    Documentation describing the available attributes: https://docs.python.org/2/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element
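
The "random" string in the question's output is probably not random: ElementTree writes namespaced tags as {namespace-uri}localname, and the sample file declares the GraphML namespace on its root. The following is a minimal sketch of one way to read the tag names and attributes the question asks about, using the sample document from the question. The NS mapping, its "g" prefix, and the local_name helper are names made up for this illustration and are not part of ElementTree.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined as bytes so the sketch is self-contained.
GRAPHML = b"""<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <node id="someNode">
    <data key="label">someNode</data>
  </node>
</graphml>"""

# Hypothetical prefix-to-URI mapping; the prefix "g" is an arbitrary choice.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def local_name(tag):
    """Hypothetical helper: drop the '{namespace-uri}' part of a tag."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(GRAPHML)

# .attrib is a dict of attribute names and values.
# The xmlns declaration is not reported as an attribute.
print(local_name(root.tag), root.attrib)

# Searches must be namespace-qualified, otherwise nothing matches.
for node in root.findall("g:node", NS):
    print(local_name(node.tag), node.attrib)
    for data in node.findall("g:data", NS):
        print(local_name(data.tag), data.attrib, data.text)
```

With the sample document this prints graphml with an empty dict, then node with {'id': 'someNode'}, then data with {'key': 'label'} and the text someNode. When reading from a file, as in the question, etree.parse('myDataset').getroot() would take the place of ET.fromstring.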