Search code examples
pythonxmlelementtreefindall

Get items from xml Python


I have an xml in python, need to obtain the elements of the "Items" tag in an iterable list.

I need get a iterable list from this XML, for example like it:

  • Item 1: Bicycle, value $250, iva_tax: 50.30
  • Item 2: Skateboard, value $120, iva_tax: 25.0
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<data>
    <info>Listado de items</info>
    <detalle>
        <![CDATA[<?xml version="1.0" encoding="UTF-8"?>
        <tienda id="tiendaProd" version="1.1.0">
            <items>
                <item>
                    <nombre>Bicycle</nombre>
                    <valor>250</valor>
                    <data>
                        <tax name="iva" value="50.30"></tax>
                    </data>
                </item>
                <item>
                    <nombre>Skateboard</nombre>
                    <valor>120</valor>
                    <data>
                        <tax name="iva" value="25.0"></tax>
                    </data>
                </item>
                <item>
                    <nombre>Motorcycle</nombre>
                    <valor>900</valor>
                    <data>
                        <tax name="iva" value="120.50"></tax>
                    </data>
                </item>
            </items>
        </tienda>]]>
    </detalle>
</data>

I am working with import xml.etree.ElementTree as ET for example

import xml.etree.ElementTree as ET

xml = ET.fromstring(stringBase64)
ite = xml.find('.//detalle').text
tixml = ET.fromstring(ite)

Solution

  • You can use BeautifulSoup4 (BS4) to do this.

    from bs4 import BeautifulSoup
    
    #Read XML file
    with open("example.xml", "r") as f:
        contents = f.readlines()
    
    #Create Soup object
    soup = BeautifulSoup(contents, 'xml')
    
    #find all the item tags
    item_tags = soup.find_all("item") #returns everything in the <item> tags
    
    #find the nombre and valor tags within each item
    results = {}
    for item in item_tags:
        num = item.find("nombre").text
        val = item.find("valor").text
        results[str(num)] = val
    
    #Prints dictionary with key value pairs from the xml
    print(results)