python xml xpath atlassian-crowd

Parse XML to get a list or dictionary of data values from crowd API


I have this sample XML response from the Atlassian Crowd REST API; the real one will be a lot bigger. I am trying to parse the XML using Python's ElementTree.

Here is the XML file, named crowd.xml:

<?xml version='1.0' encoding='utf-8'?>
<memberships>
 <membership group="crowd-administrators">
  <users>
   <user name="admin" />
   <user name="[email protected]" />
  </users>
  <groups>
  </groups>
 </membership>
 <membership group="developers">
  <users>
   <user name="johns" />
   <user name="millers" />
   <user name="peeryj" />
  </users>
  <groups>
  </groups>
 </membership>
</memberships>

From this Atlassian Crowd API response, I need to extract the list of all the group names, such as crowd-administrators and developers. I need a list or dictionary of all the user names in each group. I also need to list all the users in a particular group.

I am trying to use XPath but am unable to get the values of the group name and user names.

import xml.etree.ElementTree as ET

def parseXML(xmlfile):
    tree = ET.parse(xmlfile)
    root = tree.getroot()
    users = tree.findall(".//user[@name='admin']")
    print(users)

parseXML("crowd.xml")

This doesn't print anything.

I was able to print out the entire XML with ET.fromstring:

import requests
import xml.etree.ElementTree as ET

def parseXML2():
    url = 'http://localhost:8095/crowd/rest/usermanagement/latest/group/membership'
    response = requests.get(url, auth=("app-name", "passwd"))
    xml_response = ET.fromstring(response.content)
    print(xml_response)

parseXML2()

I would have used JSON output for this, but this API doesn't support it. I'm not sure how to extract the group names and the users in them. Any help with extracting the data is appreciated. TIA


Solution

  • I need to extract the list of all the group names like crowd-administrators and developers. I need a list or dictionary of all the user names in each group.

    Something like the following:

    import xml.etree.ElementTree as ET
    from collections import defaultdict
    
    xml = '''<?xml version='1.0' encoding='utf-8'?>
    <memberships>
     <membership group="crowd-administrators">
      <users>
       <user name="admin" />
       <user name="[email protected]" />
      </users>
      <groups>
      </groups>
     </membership>
     <membership group="developers">
      <users>
       <user name="johns" />
       <user name="millers" />
       <user name="peeryj" />
      </users>
      <groups>
      </groups>
     </membership>
    </memberships>
    '''
    
    data = defaultdict(list)
    root = ET.fromstring(xml)
    for m in root.findall('.//membership'):
        group_name = m.attrib['group']
        for u in m.findall('.//user'):
            data[group_name].append(u.attrib['name'])
    print(data)
    

    Output:

    defaultdict(<class 'list'>, {'crowd-administrators': ['admin', '[email protected]'], 'developers': ['johns', 'millers', 'peeryj']})
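
The question also asks for the plain list of group names and for the users of one particular group. Here is a minimal sketch of both, assuming the same XML layout as above. `group_names` and `users_in_group` are hypothetical helper names, and the sample is trimmed down to placeholder users:

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the sample response from the question
# (placeholder user names, not real Crowd data).
SAMPLE = """<memberships>
 <membership group="crowd-administrators">
  <users>
   <user name="admin" />
  </users>
  <groups>
  </groups>
 </membership>
 <membership group="developers">
  <users>
   <user name="johns" />
   <user name="millers" />
   <user name="peeryj" />
  </users>
  <groups>
  </groups>
 </membership>
</memberships>"""


def group_names(root):
    """Return the 'group' attribute of every <membership> element."""
    return [m.get("group") for m in root.iter("membership")]


def users_in_group(root, group):
    """Return the user names listed under one group (hypothetical helper)."""
    # The [@group='...'] predicate selects the matching <membership>.
    # Assumption: the group name contains no single quote.
    path = f".//membership[@group='{group}']/users/user"
    return [u.get("name") for u in root.findall(path)]


root = ET.fromstring(SAMPLE)
print(group_names(root))                   # ['crowd-administrators', 'developers']
print(users_in_group(root, "developers"))  # ['johns', 'millers', 'peeryj']
```

The same two functions should work on the live response if root is built with ET.fromstring(response.content), as in parseXML2 from the question. A group name that is not present simply yields an empty list.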