python xml parsing elementtree minidom

Parse XML to file with array


I need to generate the file shown below by parsing this sample XML with Python:

Sample XML

<fruits>
<tag beginTime="20181125020000" endTime="20181202020000">
<EventId>16778</EventId>
    <item color="red">
        <name>apple</name>
        <count>1</count>
        <subtag>
            <Info name="Eid">396</Info>
            <Info name="New">397</Info>
        </subtag>
    </item>
    <item color="yellow">
        <name>banana</name>
        <count>2</count>
        <subtag>
            <Info name="Eid">500</Info>
            <Info name="New">650</Info>
            <Info name="Col">999</Info>
        </subtag>
    </item>
</tag>
</fruits>

Desired Output:

20181125020000|20181202020000|16778|red|apple|1|Eid;396;New;397|
20181125020000|20181202020000|16778|yellow|banana|2|Eid;500;New;650;Col;999|

Solution

  • Another way to do it is to convert the XML to a dictionary with xmltodict:

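    import xmltodict
    
    # force_list keeps 'item' and 'Info' as lists even when only one is present
    with open('file.xml') as f:
        d = xmltodict.parse(f.read(), force_list=('item', 'Info'))['fruits']['tag']
    
    for i in d['item']:
        # Collect each <Info> element as "name;value"
        subtag = []
        for s in i['subtag']['Info']:
            subtag.append('{};{}'.format(s['@name'], s['#text']))
        # Attributes are prefixed with '@'; child element text is stored under the tag name
        print('{}|{}|{}|{}|{}|{}|{}|'.format(d['@beginTime'], d['@endTime'], d['EventId'], i['@color'], i['name'], i['count'], ';'.join(subtag)))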
    

    Output:

    20181125020000|20181202020000|16778|red|apple|1|Eid;396;New;397|
    20181125020000|20181202020000|16778|yellow|banana|2|Eid;500;New;650;Col;999|
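
Since the question is tagged elementtree, the same result can probably be reached with only the standard library. The following is a minimal sketch, not part of the original answer. It assumes the sample is well-formed (with a closing `</fruits>`) and that every `<item>` has the attributes and children shown in the question. The function name `xml_to_rows` and the `SAMPLE` string are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question, with the root element closed
# (assumption: the real file is well-formed XML).
SAMPLE = """<fruits>
<tag beginTime="20181125020000" endTime="20181202020000">
<EventId>16778</EventId>
    <item color="red">
        <name>apple</name>
        <count>1</count>
        <subtag>
            <Info name="Eid">396</Info>
            <Info name="New">397</Info>
        </subtag>
    </item>
    <item color="yellow">
        <name>banana</name>
        <count>2</count>
        <subtag>
            <Info name="Eid">500</Info>
            <Info name="New">650</Info>
            <Info name="Col">999</Info>
        </subtag>
    </item>
</tag>
</fruits>"""


def xml_to_rows(xml_text):
    """Return one pipe-delimited line per <item> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    rows = []
    for tag in root.iter("tag"):
        begin = tag.get("beginTime")
        end = tag.get("endTime")
        event_id = tag.findtext("EventId")
        for item in tag.findall("item"):
            # Flatten every <Info> into alternating name;value entries
            pairs = []
            for info in item.findall("subtag/Info"):
                pairs.extend([info.get("name"), info.text])
            fields = [begin, end, event_id, item.get("color"),
                      item.findtext("name"), item.findtext("count"),
                      ";".join(pairs)]
            # Each line ends with a trailing '|', as in the desired output
            rows.append("|".join(fields) + "|")
    return rows


rows = xml_to_rows(SAMPLE)
print("\n".join(rows))
```

To produce a file rather than console output, the returned lines can be written with `open(path, "w")` and `f.write("\n".join(rows) + "\n")`, where `path` is whatever output file name you choose.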