I have the following code which works:
import xml.etree.ElementTree as etree

def get_path(self):
    parent = ''
    path = self.tag
    sibs = self.parent.findall(self.tag)
    if len(sibs) > 1:
        path = path + '[%s]'%(sibs.index(self)+1)
    current_node = self
    while True:
        parent = current_node.parent
        if not parent:
            break
        ptag = parent.tag
        path = ptag + '/' + path
        current_node = parent
    return path

etree._Element.get_path = get_path
etree._Element.parent = None

class XmlDoc(object):
    def __init__(self):
        self.root = etree.Element('root')
        self.doc = etree.ElementTree(self.root)

    def SubElement(self, parent, tag):
        new_node = etree.SubElement(parent, tag)
        new_node.parent = parent
        return new_node

doc = XmlDoc()
a1 = doc.SubElement(doc.root, 'a')
a2 = doc.SubElement(doc.root, 'a')
b = doc.SubElement(a2, 'b')

print etree.tostring(doc.root), '\n'
print 'element:'.ljust(15), a1
print 'path:'.ljust(15), a1.get_path()
print 'parent:'.ljust(15), a1.parent, '\n'
print 'element:'.ljust(15), a2
print 'path:'.ljust(15), a2.get_path()
print 'parent:'.ljust(15), a2.parent, '\n'
print 'element:'.ljust(15), b
print 'path:'.ljust(15), b.get_path()
print 'parent:'.ljust(15), b.parent
Which results in this output:
<root><a /><a><b /></a></root>

element:        <Element a at 87e3d6c>
path:           root/a[1]
parent:         <Element root at 87e3cec>

element:        <Element a at 87e3fac>
path:           root/a[2]
parent:         <Element root at 87e3cec>

element:        <Element b at 87e758c>
path:           root/a/b
parent:         <Element a at 87e3fac>
Now this is drastically changed from the original code, but I'm not allowed to share that.
The functions themselves aren't too inefficient, but there was a dramatic performance decrease when I switched from cElementTree to ElementTree. I expected that, but from my experiments monkey patching cElementTree seems to be impossible, so I had to switch.
What I need to know is whether there is a way to add a method to cElementTree, or a more efficient way of doing this, so I can gain some of my performance back.
Just to let you know, as a last resort I am considering adding selective static typing and compiling with Cython, but for certain reasons I would really rather not do that.
Thanks for taking a look.
EDIT: Sorry for the wrong use of the term late binding. Sometimes my vocabulary leaves something to be desired. What I meant was "monkey patching."
EDIT: @Corley Brigman, Guy: Thank you very much for your answers, which do address the question. However (and I should have stated this in the original post), I had previously completed this project using lxml, a wonderful library that made coding a breeze. Due to new requirements, it now needs to be implemented as an add-on to a product called Splunk, which ties me to the Python 2.7 interpreter shipped with Splunk and eliminates the possibility of adding third-party libraries, with the exception of Django.
If you need parents, use lxml instead - it tracks parents internally, and is still C behind the scenes so it's very fast.
However... be aware that there is a tradeoff in tracking parents, in that a given node can only have a single parent. This isn't usually a problem. However, if you do something like the following, you will get different results in cElementTree vs. lxml:
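# Run once with each library:
#   from xml.etree.cElementTree import Element, SubElement, dump
#   from lxml.etree import Element, SubElement, dump
p = Element('x')
q = Element('y')
r = SubElement(p, 'z')  # r starts out as a child of p
q.append(r)             # then r is appended to q as well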
cElementTree:
dump(p)
<x><z /></x>
dump(q)
<y><z /></y>
lxml:
dump(p)
<x/>
dump(q)
<y>
<z/>
</y>
Since lxml tracks parents, a node can only have one parent there. As you can see, in cElementTree the element r shows up in both trees (the same object is referenced by both parents, it is not copied), whereas in lxml it is reparented, i.e. moved from p to q.
There are probably only a small number of use cases where this matters, but it is something to keep in mind.
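If lxml is not an option (as with the Splunk constraint mentioned in the question's edit), one standard-library alternative to monkey patching is to build a child-to-parent map once and look parents up in it. The following is only a minimal sketch, assuming the tree is not modified after the map is built and that each element has a single parent; `build_parent_map` and `get_path` are hypothetical helper names, not part of ElementTree. Unlike the question's `get_path`, it indexes every step that has same-tag siblings, so `b` comes out as `root/a[2]/b` rather than `root/a/b`.

```python
# On Python 2.7, xml.etree.cElementTree offers the same calls used here.
import xml.etree.ElementTree as etree


def build_parent_map(root):
    """Map every element below root to its parent in one pass (hypothetical helper)."""
    return {child: parent for parent in root.iter() for child in parent}


def get_path(node, parent_map):
    """Return a path like root/a[2]/b, indexing steps that have same-tag siblings."""
    steps = []
    while node is not None:
        parent = parent_map.get(node)  # None for the root element
        step = node.tag
        if parent is not None:
            sibs = parent.findall(node.tag)
            if len(sibs) > 1:
                step += '[%d]' % (sibs.index(node) + 1)
        steps.append(step)
        node = parent
    return '/'.join(reversed(steps))


root = etree.Element('root')
a1 = etree.SubElement(root, 'a')
a2 = etree.SubElement(root, 'a')
b = etree.SubElement(a2, 'b')

parents = build_parent_map(root)
print(get_path(a1, parents))  # root/a[1]
print(get_path(b, parents))   # root/a[2]/b
```

Because nothing is attached to the element class, this should work the same with the C-accelerated implementation. The map has to be rebuilt (or updated by hand) whenever nodes are added or moved.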