Parsing XML in Python using Multiple Loops and ET


I am writing a script in Python 2.7 that talks to a firewall API, which returns its responses as XML. I would like it to find all rules that meet certain conditions, but I am having trouble parsing the XML.

Because my environment is locked down, I cannot use outside modules, so I am using urllib2 and ElementTree.

XML (the actual XML is massive)

<response status="success" code="19">
  <result total-count="1" count="1">
    <security>
        <rules>
            <entry name="RULE 1">
                <source>
                    <member>169.254.0.1</member>
                    <member>169.254.0.2</member>
                </source>
                <destination>
                    <member>any</member>
                </destination>                  
            </entry>
            <entry name="RULE 2">
                <source>
                    <member>169.254.0.3</member>
                    <member>169.254.0.4</member>
                </source>
                <destination>
                    <member>192.168.1.1</member>
                </destination>                  
            </entry>
        </rules>
    </security>
  </result>
</response>

I want to find out whether the source or destination of any of the rules is equal to "any", and then report which rule meets this condition.

I've written this to find all the rules:

import urllib2 
import xml.etree.ElementTree as ET

url = "https://MyFirewall/api"

response = urllib2.urlopen(url) 
html = response.read() 
contents = ET.fromstring(html)

#Get the list of rules
rules = []
for item in contents.findall('./result/security/rules/entry'):
    rules.append(item.attrib['name'])

My thought at this point was to use this "rules" list to build an XPath such as "./result/security/rules/entry/rules[x]" or something similar (I would probably have to use @name), and then search the source and destination nodes with an if condition for what I am looking for. That way I could associate each rule name with its source and destination.

I then realized that there is probably a much simpler way and thought I should ask here.

Thank you


Solution

  • I was able to figure it out.

    What I was missing was a way to filter the query by attribute, using a predicate such as "[@name='VARIABLE']" in the path passed to findall.

    import urllib2
    import xml.etree.ElementTree as ET

    url = "https://MyFirewall/api"

    response = urllib2.urlopen(url)
    html = response.read()
    contents = ET.fromstring(html)

    # Collect the name of every rule
    rules = []
    for item in contents.findall('./result/security/rules/entry'):
        rules.append(item.attrib['name'])

    for rule in rules:
        # Select this rule's entry by its name attribute
        entry = ".//*[@name='" + rule + "']"  # This predicate is what I was missing

        # Keep every member, not just the last one found
        sources = [m.text for m in contents.findall(entry + "/source/member")]
        destinations = [m.text for m in contents.findall(entry + "/destination/member")]

        if ("any" in sources) or ("any" in destinations):
            # 'device' (the device group name) is set elsewhere in the full script
            print "Rule " + rule + " in Device Group: " + device + " contains an ANY"
            # do stuff
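
A possible simplification, sketched here rather than taken from the original answer: since findall already returns each entry element, the source and destination members can be searched relative to that element, so the second lookup by name is not needed. This also sidesteps building a path string from a rule name, which could break if a name contained a quote. The SAMPLE string is a trimmed copy of the XML from the question standing in for the API response, and rules_with_any is a hypothetical helper name. The __future__ import is only there so the print call works on both Python 2.7 and Python 3.

```python
from __future__ import print_function  # lets print() behave the same on Python 2.7
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response from the question (stands in for the API call).
SAMPLE = """<response status="success" code="19">
  <result total-count="1" count="1">
    <security>
      <rules>
        <entry name="RULE 1">
          <source><member>169.254.0.1</member><member>169.254.0.2</member></source>
          <destination><member>any</member></destination>
        </entry>
        <entry name="RULE 2">
          <source><member>169.254.0.3</member><member>169.254.0.4</member></source>
          <destination><member>192.168.1.1</member></destination>
        </entry>
      </rules>
    </security>
  </result>
</response>"""


def rules_with_any(root):
    """Return the names of rules whose source or destination lists 'any'."""
    matches = []
    for entry in root.findall('./result/security/rules/entry'):
        # Paths are relative to the current entry, so the name stays associated.
        sources = [m.text for m in entry.findall('./source/member')]
        destinations = [m.text for m in entry.findall('./destination/member')]
        if 'any' in sources or 'any' in destinations:
            matches.append(entry.get('name'))
    return matches


print(rules_with_any(ET.fromstring(SAMPLE)))
```

On the sample above, only RULE 1 is reported, because its destination is "any". In the real script, the root element would come from ET.fromstring on the urllib2 response, as in the answer.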