
Parsing XML for the text of every occurrence of a specific tag is not working


I am trying to gather the text of every <sequence-number> element into a list. Here is my code:

#!/usr/bin/env python

from lxml import etree 

response = '''
<rpc-reply xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:0d07cdf5-c8e5-45d9-89d1-92467ffd7fe4">
 <data>
  <ipv4-acl-and-prefix-list xmlns="http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-acl-cfg">
   <accesses>
    <access>
     <access-list-name>TESTTEST</access-list-name>
     <access-list-entries>
      <access-list-entry>
       <sequence-number>1</sequence-number>
       <remark>TEST</remark>
       <sequence-str>1</sequence-str>
      </access-list-entry>
      <access-list-entry>
       <sequence-number>10</sequence-number>
       <grant>permit</grant>
       <source-network>
        <source-address>10.10.5.0</source-address>
        <source-wild-card-bits>0.0.0.255</source-wild-card-bits>
       </source-network>
       <next-hop>
        <next-hop-type>regular-next-hop</next-hop-type>
        <next-hop-1>
         <next-hop>10.10.5.2</next-hop>
         <vrf-name>SANE</vrf-name>
        </next-hop-1>
       </next-hop>
       <sequence-str>10</sequence-str>
      </access-list-entry>
      <access-list-entry>
       <sequence-number>20</sequence-number>
       <grant>permit</grant>
       <source-network>
        <source-address>10.10.6.0</source-address>
        <source-wild-card-bits>0.0.0.255</source-wild-card-bits>
       </source-network>
       <next-hop>
        <next-hop-type>regular-next-hop</next-hop-type>
        <next-hop-1>
         <next-hop>10.10.6.2</next-hop>
         <vrf-name>VRFNAME</vrf-name>
        </next-hop-1>
       </next-hop>
       <sequence-str>20</sequence-str>
      </access-list-entry>
     </access-list-entries>
    </access>
   </accesses>
  </ipv4-acl-and-prefix-list>
 </data>
</rpc-reply>
'''
q = etree.fromstring(response)
print(q.findall('.//sequence-number'))

But I get no matches in the output. I have also tried the following statements, with no luck:

print(q.findall('./sequence-number/'))
print(q.findall('sequence-number/'))
print(q.findall('.//sequence-number/'))
print(q.findall('sequence-number'))

How can I gather this data?


Solution

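  • As mentioned in the comments, XML namespaces have to be taken into account: the <sequence-number> elements sit inside an element that declares the default namespace http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-acl-cfg, so an unqualified name does not match them. The simplest way to handle this is to use the xpath function instead of findall, with a slight modification of the search expression: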

    print(q.xpath(".//*[local-name()='sequence-number']"))
    

    Here the expression .//*[local-name()='sequence-number'] contains the wildcard * with the predicate [local-name()='sequence-number']. It selects every descendant element whose local name (the tag name with the namespace ignored) equals "sequence-number".

    Another approach is to create a namespace map and pass it to the findall function:

    ns = {"ns":"http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-acl-cfg"}
    print(q.findall(".//ns:sequence-number", ns))
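
Both calls above return element objects, while the question asks for the text of each element. The following is a minimal sketch of the namespace-map approach that collects the text into a list. It makes a few assumptions: it uses the standard library's xml.etree.ElementTree instead of lxml (findall with a namespace map takes the same arguments in both), the XML is a trimmed-down version of the reply from the question, and the names SAMPLE, ACL_NS and sequence_numbers are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the reply from the question (two entries only).
SAMPLE = '''
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
 <data>
  <ipv4-acl-and-prefix-list xmlns="http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-acl-cfg">
   <access-list-entry>
    <sequence-number>1</sequence-number>
   </access-list-entry>
   <access-list-entry>
    <sequence-number>10</sequence-number>
   </access-list-entry>
  </ipv4-acl-and-prefix-list>
 </data>
</rpc-reply>
'''

# Default namespace declared on <ipv4-acl-and-prefix-list> in the question.
ACL_NS = "http://cisco.com/ns/yang/Cisco-IOS-XR-ipv4-acl-cfg"


def sequence_numbers(xml_text):
    """Return the text of every <sequence-number> element as a list of strings."""
    root = ET.fromstring(xml_text)
    # "ns" is an arbitrary prefix; only the URI it maps to has to match.
    ns = {"ns": ACL_NS}
    return [el.text for el in root.findall(".//ns:sequence-number", ns)]


root = ET.fromstring(SAMPLE)
# The unqualified name matches nothing, which is the symptom from the question.
print(root.findall(".//sequence-number"))  # []
# Each parsed tag carries its namespace as "{uri}name".
print(root.find(".//ns:sequence-number", {"ns": ACL_NS}).tag)
print(sequence_numbers(SAMPLE))  # ['1', '10']
```

The values come back as strings; if integers are needed, wrap el.text in int(). With lxml, the same list comprehension should work on the result of q.xpath(".//*[local-name()='sequence-number']") as well.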