python xml lxml cdata

How to add space before and after CDATA in XML file


I want to write a function that modifies XML content without changing the file's formatting. I managed to change the text, but the formatting changes when the file is written back. Specifically, I want to keep a space before and after the CDATA section in the XML file.

Default XML file:

<?xml version="1.0" encoding="utf-8"?>
<Maps xmlns="http://www.semi.org">
  <Map>
    <Device>
      <ReferenceDevice/>
      <Bin>
        <Bin Bin="001"/>
      </Bin>
      <Data>
        <Row> <![CDATA[001 001 001]]> </Row>
      </Data>
    </Device>
  </Map>
</Maps>

And I am getting this result:

<?xml version="1.0" encoding="utf-8"?>
<Maps xmlns="http://www.semi.org">
  <Map>
    <Device>
      <ReferenceDevice/>
      <Bin>
        <Bin Bin="001"/>
      </Bin>
      <Data>
        <Row><![CDATA[001 001 099]]></Row>
      </Data>
    </Device>
  </Map>
</Maps>

However, I want the new XML to look like this:

<?xml version="1.0" encoding="utf-8"?>
<Maps xmlns="http://www.semi.org">
  <Map>
    <Device>
      <ReferenceDevice/>
      <Bin>
        <Bin Bin="001"/>
      </Bin>
      <Data>
        <Row> <![CDATA[001 001 099]]> </Row>
      </Data>
    </Device>
  </Map>
</Maps>

Here is my code:

from lxml import etree as ET

def xml_new(f, fpath, newtext, xmlrow):
    # strip_cdata=False keeps CDATA sections instead of turning them into plain text
    parser = ET.XMLParser(strip_cdata=False)
    tree = ET.parse(f, parser)
    root = tree.getroot()
    for child in root:
        # old text of the target row (xmlrow is the row's index under <Data>)
        value = child[0][2][xmlrow].text

    # after the loop, child is the last <Map>; replace its target row
    text = ET.CDATA(newtext)
    child[0][2][xmlrow] = ET.Element('Row')
    child[0][2][xmlrow].text = text
    child[0][2][xmlrow].tail = "\n"
    ET.register_namespace('A', "http://www.semi.org")
    tree.write(fpath, encoding='utf-8', xml_declaration=True)
    return value

Can anyone help me with this? Thanks in advance!


Solution

  • Thanks for all your help. I have found another way to achieve the result I want.

    This is the code:

    # what you want to change
    replaceby = '020]]> </Row>\n'
    # row you want to change (counting only lines that contain CDATA, starting at 1)
    row = 1
    # col you want to change: index of the item after splitting the line on
    # single spaces (leading indentation spaces count as empty items)
    col = 3
    # file and newfile are the input and output paths, set elsewhere
    with open(file, 'r') as infile:
        lines = infile.readlines()
    i = 0
    editedXML = []
    for l in lines:
        if 'cdata' in l.lower():
            i += 1
            if i == row:
                oldVal = l.split(' ')
                # swap in the replacement at position col, keep everything else
                newVal = [replaceby if index == col else old
                          for index, old in enumerate(oldVal)]
                editedXML.append(' '.join(newVal))
            else:
                editedXML.append(l)
        else:
            editedXML.append(l)
    with open(newfile, 'w') as outfile:
        outfile.write(''.join(editedXML))
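
A possible variation on the same text-based idea, offered as a minimal sketch rather than a tested drop-in: instead of splitting whole lines on spaces, a regular expression can target only the payload inside a CDATA section, so the indentation and the spaces around `<![CDATA[...]]>` are never touched. The function name `replace_cdata_token` and its parameters are hypothetical, and the sketch assumes the values inside the CDATA section are separated by single spaces and that the new value does not contain `]]>`.

```python
import re

# Matches one CDATA section; group 2 is the payload between the delimiters.
CDATA_RE = re.compile(r"(<!\[CDATA\[)(.*?)(\]\]>)", re.DOTALL)


def replace_cdata_token(xml_text, row, col, new_token):
    """Replace token `col` (0-based) in CDATA section `row` (1-based).

    Everything outside the matched payload, including the spaces around
    the CDATA section, is copied through unchanged.
    """
    seen = 0

    def _edit(match):
        nonlocal seen
        seen += 1
        if seen != row:
            return match.group(0)  # not the target section: keep as-is
        tokens = match.group(2).split(" ")
        tokens[col] = new_token
        return match.group(1) + " ".join(tokens) + match.group(3)

    return CDATA_RE.sub(_edit, xml_text)


# Hypothetical sample data modelled on the rows in the question.
sample = (
    "<Data>\n"
    "  <Row> <![CDATA[001 001 001]]> </Row>\n"
    "  <Row> <![CDATA[002 002 002]]> </Row>\n"
    "</Data>\n"
)
result = replace_cdata_token(sample, row=1, col=2, new_token="099")
print(result)
```

Because the file is handled as plain text, the XML declaration, namespace declaration and indentation are written back exactly as they were read. The trade-off is that nothing checks the result is still well-formed XML, so it may be worth parsing the output once afterwards as a sanity check.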