python xml parsing odoo-14

How to parse data inside XML?


Any idea how to parse this kind of record? Its email_body field contains another record, wrapped in a CDATA section.

        <record id="1" model="custom.model">
            <field name="name">Create</field>
            <field name="email_from">[email protected]</field>
            <field name="email_to">[email protected]</field>
            <field name="email_subject">Create new company</field>
            <field name="email_body">
                <![CDATA[
                <record>
                    <field name="process">Create</field>
                    <field name="model">res.company</field>
                    <field name="name">XYZ Company</field>
                    <field name="currency_id">base.USD</field>
                </record>
                ]]>
            </field>
            <field name="email_read">False</field>
        </record>

Solution

  • Assuming you are looking for the data inside the CDATA section, the code below finds the email_body field and parses its text as XML.

    import xml.etree.ElementTree as ET
    
    
    xml = '''<record id="1" model="custom.model">
                <field name="name">Create</field>
                <field name="email_from">[email protected]</field>
                <field name="email_to">[email protected]</field>
                <field name="email_subject">Create new company</field>
                <field name="email_body">
                    <![CDATA[
                    <record>
                        <field name="process">Create</field>
                        <field name="model">res.company</field>
                        <field name="name">XYZ Company</field>
                        <field name="currency_id">base.USD</field>
                    </record>
                    ]]>
                </field>
                <field name="email_read">False</field>
            </record>'''
    outer_root = ET.fromstring(xml)
    email = outer_root.find('.//field[@name="email_body"]')
    inner_root = ET.fromstring(email.text)
    for field in inner_root.findall('field'):
        print(f'{field.attrib["name"]} -> {field.text}')
    

    Output:

    process -> Create
    model -> res.company
    name -> XYZ Company
    currency_id -> base.USD
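
If the embedded values are needed as data and not just printed, the same two-step parse can be wrapped in a small helper. This is only a sketch: the function name `parse_embedded_record` and its `body_field` parameter are made up for illustration, and it assumes the CDATA body holds exactly one inner record. It returns an empty dict when the field is missing, empty, or not well-formed XML.

```python
import xml.etree.ElementTree as ET


def parse_embedded_record(outer_xml, body_field="email_body"):
    """Return the fields of the record embedded in `body_field` as a dict.

    Hypothetical helper: assumes the field's text is a single XML record.
    """
    outer_root = ET.fromstring(outer_xml)
    body = outer_root.find(f'.//field[@name="{body_field}"]')
    if body is None or not (body.text or "").strip():
        return {}  # no embedded record to parse
    try:
        # The parser hands the CDATA content back as plain text,
        # so it has to be parsed a second time.
        inner_root = ET.fromstring(body.text.strip())
    except ET.ParseError:
        return {}  # body is not well-formed XML
    return {
        field.get("name"): (field.text or "").strip()
        for field in inner_root.findall("field")
    }


sample = '''<record id="1" model="custom.model">
    <field name="email_body"><![CDATA[
        <record>
            <field name="process">Create</field>
            <field name="model">res.company</field>
        </record>
    ]]></field>
</record>'''

fields = parse_embedded_record(sample)
print(fields["model"])
```

Returning a dict keyed by the `name` attribute makes lookups such as `fields["model"]` straightforward. If the inner record could repeat a field name, a list of pairs would be the safer choice.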