I have an XML document that looks something like this:
<MyXmlRoot>
<App xmlns='urn:SomethingSomething1'>
...
</App>
<User xmlns='urn:SomethingSomething2'>
...
</User>
<Doc xmlns='urn:SomethingSomething3'>
<level2>
<level3>
<level4>
<level5>
<level6>
<level7>
<level8>
<level9>
<level10>Content at the deepest level</level10>
</level9>
</level8>
</level7>
</level6>
</level5>
</level4>
</level3>
</level2>
</Doc>
</MyXmlRoot>
I use lxml to read and parse it like this:
from lxml import etree
tree = etree.parse("textxml.xml")
root = tree.getroot()
If I pretty-print from root, it shows the entire XML, which is good. But when I try to read the value of a specific tag like so:
content = root.xpath('//level10/text()')
xpath can't find any tag below the root and returns an empty list. I suspect it's because of the namespaces, but I can't find a way to make xpath read the values. Any advice?
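Put the namespace in braces, {urn:SomethingSomething3}, in front of the tag you want to search for. The xmlns declaration on <Doc> places <Doc> and all of its descendants in that namespace, and an unprefixed name in XPath only matches elements that have no namespace: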
from lxml import etree
xml_data = """
<MyXmlRoot>
<App xmlns='urn:SomethingSomething1'>
</App>
<User xmlns='urn:SomethingSomething2'>
</User>
<Doc xmlns='urn:SomethingSomething3'>
<level2>
<level3>
<level4>
<level5>
<level6>
<level7>
<level8>
<level9>
<level10>Content at the deepest level</level10>
</level9>
</level8>
</level7>
</level6>
</level5>
</level4>
</level3>
</level2>
</Doc>
</MyXmlRoot>
"""
root = etree.fromstring(xml_data)
level10_text = root.find(".//{urn:SomethingSomething3}level10").text
print("Text from <level10> tag:", level10_text)
Prints:
Text from <level10> tag: Content at the deepest level
Or use etree.ETXPath, which accepts the same {namespace}tag notation inside a full XPath expression:
to_search = etree.ETXPath("//{urn:SomethingSomething3}level10/text()")
level10_text = to_search(root)
print("Text from <level10> tag:", level10_text)
Prints:
Text from <level10> tag: ['Content at the deepest level']
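If you would rather keep calling root.xpath() as in the question, two other options may work. This is a minimal sketch, not the only way to do it: the prefix "d" is an arbitrary name picked for this example (it does not appear in the document), and the XML is shortened to the relevant branch.

```python
from lxml import etree

# Shortened version of the document from the question
xml_data = """
<MyXmlRoot>
  <App xmlns='urn:SomethingSomething1'></App>
  <User xmlns='urn:SomethingSomething2'></User>
  <Doc xmlns='urn:SomethingSomething3'>
    <level2>
      <level10>Content at the deepest level</level10>
    </level2>
  </Doc>
</MyXmlRoot>
"""
root = etree.fromstring(xml_data)

# Option 1: map a prefix of your choice to the namespace URI.
# "d" is an assumption made for this example; any prefix works.
ns = {"d": "urn:SomethingSomething3"}
by_prefix = root.xpath("//d:level10/text()", namespaces=ns)
print(by_prefix)

# Option 2: ignore the namespace and match on the local tag name only.
by_local_name = root.xpath("//*[local-name()='level10']/text()")
print(by_local_name)
```

The prefix mapping is usually the safer choice when several namespaces use the same tag names, since local-name() matches level10 in any namespace.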