Tags: python, python-3.x, xml, xml-parsing, cdata

Extract CDATA from XML with Python


I'm working with an XML file like this:

import xml.etree.ElementTree as ET

xml = '''
<root>
    <a name='name1' label='label1'
      <b>
        <result para='1'
      </b>
    </a>
    <name><![CDATA[<?xml version='1.0'?>
    <name2><b a="" n="label1" x="32"/><b a="" n="label2" x="4"/></b></name2>]]></name>
</root>
'''

myroot = ET.fromstring(xml)

I want to extract the content of the CDATA section so that I can pull some information out of it and analyze it as a string.

I haven't found a way to do it yet. Has anyone done this before, or does anyone have an idea that could help me?

Thanks in advance


Solution

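  • First of all, your XML is not well formed: the a and result start tags are never terminated with > or />, so ET.fromstring raises a ParseError on it as written. Once that is fixed, you can extract the content with the .find method. The parser hands back the CDATA content as the element's ordinary text: name_content = myroot.find('name').text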
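
To make the answer above concrete, here is a minimal sketch. It assumes the intended XML simply closes the a and result start tags (that correction is a guess at what the asker meant). The regular expression at the end is only an illustration of "analyzing it as a string" and is not part of the original answer. The variable name labels is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed well-formed variant of the XML from the question:
# the <a> start tag is closed with '>' and <result> with '/>'.
xml = '''
<root>
    <a name='name1' label='label1'>
      <b>
        <result para='1'/>
      </b>
    </a>
    <name><![CDATA[<?xml version='1.0'?>
    <name2><b a="" n="label1" x="32"/><b a="" n="label2" x="4"/></b></name2>]]></name>
</root>
'''

myroot = ET.fromstring(xml)

# The CDATA wrapper is consumed by the parser; its content is plain text.
name_content = myroot.find('name').text
print(name_content)

# Illustrative string analysis: collect the values of the n="..." attributes.
# A regex is used because the embedded snippet has a stray </b> and would
# not parse as XML itself.
labels = re.findall(r'\bn="([^"]*)"', name_content)
print(labels)
```

Note that the text inside the CDATA section is not well-formed XML on its own (there is an extra closing b tag), so passing name_content to ET.fromstring again would likely fail. Treating it as a string, as the question asks, avoids that problem.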