ElementTree: extract attribute value using findall and append value to list


I want to extract the value of the name attribute from each inter tag and, if a groups value is present on its work tag, append each group name to that name. I tried to do this with xml.etree.ElementTree, but my code does not give the expected output.

Input XML

<abtshop>
    <dDirectory>dub</dDirectory>
    <S>statusd</S>
    <work>worklogs</work>
    <custs>
        <cust>nim-us</cust>
    </custs>

    <mileage>999</mileage>

    <defaults>
        <default type="mercley">
            <user>dairy</user>
            <exec>slm.sh</exec>
            <env>
                <var name="SAN_HOME">youyou-11</var>
            </env>
        </default>
    </defaults>
    <inters>
        <inter name="nim_turk" first-day="20230301" historical="20220103" market="multi">
            <works>
                <work kind="obopay" run="jbs">
                    <args>
                        <arg name="distance">180000</arg>
                    </args>
                </work>
                <work kind="silkb" run="jbs">
                    <args>
                        <arg name="distance">180000</arg>
                    </args>
                </work>
            </works>
        </inter>
        <inter name="nim_us_m" first-day="20230301" historical="20220103" market="lone">
            <works>
                <work kind="obopay" run="jbs" groups="groupA,groupB">
                    <args>
                        <arg name="distance">120000</arg>
                        <arg name="jbsopt">xmas_size=1200000</arg>
                        <arg name="jbsopt">of_obopaying_threads=2</arg>
                    </args>
                </work>
                <work kind="silkb" run="jbs" groups="groupA,groupB">
                    <args>
                        <arg name="distance">120000</arg>
                        <arg name="jbsopt">xmas_size=1200000</arg>
                    </args>
                </work>
            </works>
        </inter>
    </inters>
</abtshop>

Required Logic

If the work tag has a groups attribute, I need the name with each group appended; otherwise, print only the name.

  <inter name="nim_us_m" first-day="20230301" historical="20220103" market="lone">
    <works>
        <work kind="obopay" run="jbs" groups="groupA,groupB">

like

nim_us_m-groupA, nim_us_m-groupB

else, print only the name

<inter name="nim_turk" first-day="20230301" historical="20220103" market="multi">

like this

nim_turk

I tried the code below to extract the values, but in vain.

Tried Code

tree=ET.parse('test.xml')
root = tree.getroot()
xm_subs=[]
for subn in root.findall(".//inter/works/work[@run='jbs'][@kind='obopay']/../.."):
        sname=subn.attrib["name"]
        for subg in root.findall(".//inter[@name='%s']/jobs/job[@run='jbs'][@kind='obopay'][@groups]" % sname):
                        groups=subg.attrib['groups']
                        for gname in groups.split(","):
                                sub_name=subn.attrib["name"] + "-" + gname
                                xm_subs.append(sub_name)

        else:
                xm_subs.append(subn.attrib["name"])
print xm_subs

Required Output

['nim_turk','nim_us_m-groupA','nim_us_m-groupB']

Solution

  • This should show you the required list:

    import xml.etree.ElementTree as ET
    
    root = ET.parse("test_xml.xml").getroot()
    
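    result = []
    # Each child of <inters> is one <inter> element.
    for elem in root.findall('.//inters'):
        for inter in elem:
            user = inter.get('name')
            # First <work> under this <inter> that has a groups attribute, if any.
            gr = inter.find(".//work[@groups]")
            if gr is not None:
                # One entry per comma-separated group, joined with '-'
                # to match the required output.
                for u_gr in gr.get('groups').split(','):
                    result.append(user + '-' + u_gr)
            else:
                result.append(user)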
    
    print(result)
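
    As for why the attempt in the question printed only plain names, two things seem to be going on. The inner findall looks for jobs/job, but the XML uses works/work, so the inner loop never finds anything. And the else attached to a for loop runs whenever the loop finishes without a break, so the plain name is appended every time, even once the path is fixed. The sketch below is one possible variant, not the only way to do it: it keeps the kind='obopay' and run='jbs' filter from the question and uses a plain if/else. The helper name inter_names and the inlined, trimmed-down XML are assumptions made so the example runs without test.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the question's XML, inlined so the sketch runs without a file.
XML = """
<abtshop>
    <inters>
        <inter name="nim_turk">
            <works>
                <work kind="obopay" run="jbs"/>
                <work kind="silkb" run="jbs"/>
            </works>
        </inter>
        <inter name="nim_us_m">
            <works>
                <work kind="obopay" run="jbs" groups="groupA,groupB"/>
                <work kind="silkb" run="jbs" groups="groupA,groupB"/>
            </works>
        </inter>
    </inters>
</abtshop>
"""


def inter_names(root, kind="obopay", run="jbs"):
    """Hypothetical helper: one entry per inter, or one per group if groups is set."""
    names = []
    for inter in root.findall("./inters/inter"):
        # First work element of this inter that matches both attribute filters.
        work = inter.find("./works/work[@kind='%s'][@run='%s']" % (kind, run))
        if work is None:
            continue  # no matching work under this inter, so skip it
        name = inter.get("name")
        groups = work.get("groups")
        if groups:
            # Expand "groupA,groupB" into name-groupA, name-groupB.
            names.extend("%s-%s" % (name, g.strip()) for g in groups.split(","))
        else:
            names.append(name)
    return names


root = ET.fromstring(XML)
result = inter_names(root)
print(result)
```

    With a real file, replace ET.fromstring(XML) with ET.parse('test.xml').getroot(). Note that an inter with no matching work is skipped here, which mirrors the filter in the question's outer findall.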