replace xml values with python


I've tried two ways to replace the values within given nodes in an XML file, and neither is working.

My File:

<?xml version="1.0" encoding="UTF-8"?>
<OrdSet xmlns="tfs" xmlns:xsi="http://www.sample.org/XMLSchema-instance" xsi:schemaLocation="tfs tfs.xsd" Version="25">
    <Msg>
        <MsgCreate>
            <Date>20160324</Date>
            <Time>111057</Time>
            <Src>
                <SrcType>D</SrcType>
                <DlrCode>0001</DlrCode>
            </Src>
            <Target>
                <TargetType>F</TargetType>
                <MgmtCode>BTG</MgmtCode>
            </Target>
        </MsgCreate>
        <MsgType>
            <OrdReq>
                <ActnCode>NEW</ActnCode>
                <SrcID>64698602107101</SrcID>
                <RepCode>0000</RepCode>
                <OrdDtl>
                    <AcctLookup>
                        <MgmtCode>ABC</MgmtCode>
                        <FundAcctID>984575</FundAcctID>
                        <AcctDesig>2</AcctDesig>
                    </AcctLookup>
                    <TrxnDtl>
                        <Buy>
                            <TrxnTyp>5</TrxnTyp>
                            <FundID>205</FundID>
                            <Amt>
                                <AmtType>D</AmtType>
                                <AmtValue>600.00</AmtValue>
                            </Amt>
                        </Buy>
                    </TrxnDtl>
                </OrdDtl>
            </OrdReq>
        </MsgType>
    </Msg>
omitted ...

My goal is to replace the ActnCode value from NEW to CAN.

I.e.,  <ActnCode>CAN</ActnCode>

Attempt #1: Script runs fine but the values are still "NEW" in the output file. Nothing seems to be changed.

import xml.etree.ElementTree as ET 
tree = ET.parse("~\input.xml")
root = tree.getroot()
elems = tree.findall('ActnCode')
for elem in elems:
	elem.txt = 'CAN'
tree.write("~\output.xml")

Attempt #2: Script runs correctly as well but it's not working as intended.

xmldoc = minidom.parse('~input.xml')
action_code = xmldoc.getElementsByTagName('ActnCode')
firstchild = action_code[0]
firstchild.setAttribute('ActnCode', 'CAN')

result:
<ActnCode ActnCode="CAN">NEW</ActnCode>

Ultimately, I want python to look through the xml doc, find all ActnCode nodes and change the values to "CAN". Any help will be appreciated.


Solution

  • You have several problems. The element you are looking for is in a namespace, inherited from the default namespace declared in <OrdSet xmlns="...">, and that namespace needs to be included in the search using the {namespace}tag form. Then, findall only looks at direct children unless you add ElementTree's XPath-style subtree search prefix, ".//". And finally, you need to change the text attribute, not txt.

    Abbreviated XML for test...

    <?xml version="1.0" encoding="UTF-8"?>
    <OrdSet xmlns="tfs">
        <Msg>
            <MsgCreate>
                <ActnCode>NEW</ActnCode>
                <SrcID>64698602107101</SrcID>
                <RepCode>0000</RepCode>
                <OrdDtl>
                    <AcctLookup>
                        <MgmtCode>ABC</MgmtCode>
                        <FundAcctID>984575</FundAcctID>
                        <AcctDesig>2</AcctDesig>
                    </AcctLookup>
                </OrdDtl>
            </MsgCreate>
       </Msg>
    </OrdSet>
    

    And your code becomes

    import xml.etree.ElementTree as ET 
    tree = ET.parse("input.xml")
    root = tree.getroot()
    # The namespace URI must match the xmlns value in the file ("tfs" here)
    elems = tree.findall('.//{tfs}ActnCode')
    print('elems', elems)
    for elem in elems:
        elem.text = 'CAN'
    tree.write("output.xml")
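
One side effect worth knowing about: when ElementTree writes a tree that uses a default namespace, it normally invents prefixes such as ns0: for every tag. The following is a minimal, self-contained sketch (it parses an inline string rather than input.xml, and the variable names XML, NS and output are only for illustration) that applies the three fixes and calls ET.register_namespace so the default namespace is kept on output. It also uses a prefix map with findall instead of the {tfs} form; the prefix "t" is arbitrary.

```python
import xml.etree.ElementTree as ET

# Inline stand-in for input.xml, trimmed to the relevant nodes.
XML = """<OrdSet xmlns="tfs">
    <Msg>
        <MsgType>
            <OrdReq>
                <ActnCode>NEW</ActnCode>
                <SrcID>64698602107101</SrcID>
            </OrdReq>
        </MsgType>
    </Msg>
</OrdSet>
"""

# Map an arbitrary prefix to the namespace URI used in the document.
NS = {"t": "tfs"}

# Serialize "tfs" as the default namespace instead of ns0: prefixes.
ET.register_namespace("", "tfs")

root = ET.fromstring(XML)

# ".//" searches the whole subtree, not just direct children.
for elem in root.findall(".//t:ActnCode", NS):
    elem.text = "CAN"

output = ET.tostring(root, encoding="unicode")
print(output)
```

With a real file, the same loop would run on tree.getroot() after ET.parse, followed by tree.write as in the code above.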
    

    EDIT

    You can do more complicated XPath queries with lxml than with ElementTree. If you want to limit which <ActnCode> elements you process, a predicate can look at other elements to refine the selection. The part inside the square brackets is essentially a filter that removes non-matching nodes. Here I limit the match to <ActnCode> nodes whose sibling OrdDtl contains AcctLookup/FundAcctID with the value 984575.

    import lxml.etree
    tree = lxml.etree.parse('input.xml')
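    # ActnCode elements whose sibling OrdDtl holds FundAcctID 984575
    elems = tree.xpath('//tfs:ActnCode[../tfs:OrdDtl/tfs:AcctLookup/tfs:FundAcctID/text()="984575"]',
        namespaces={'tfs':'tfs'})
    # Looser alternative (not used below): any ActnCode that has a sibling OrdDtl
    elems2 = tree.xpath('.//tfs:ActnCode[../tfs:OrdDtl]',
        namespaces={'tfs':'tfs'})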
    print('elems', elems)
    for elem in elems:
        elem.text = 'CAN'
    tree.write("output.xml")
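
As for the minidom attempt in the question: setAttribute adds an attribute to the element, which is why the result was <ActnCode ActnCode="CAN">NEW</ActnCode>. The visible value lives in a child text node instead. A minimal sketch of that approach, assuming every <ActnCode> contains exactly one text node and using an inline string in place of the input file (XML and result are illustrative names):

```python
from xml.dom import minidom

# Inline stand-in for the input file, trimmed to the relevant nodes.
XML = (
    '<OrdSet xmlns="tfs"><Msg><MsgType><OrdReq>'
    "<ActnCode>NEW</ActnCode>"
    "</OrdReq></MsgType></Msg></OrdSet>"
)

xmldoc = minidom.parseString(XML)

# getElementsByTagName matches on the tag name as written in the file,
# so the unprefixed name works here despite the default namespace.
for node in xmldoc.getElementsByTagName("ActnCode"):
    # Assumption: the first child is the text node holding the value.
    node.firstChild.data = "CAN"

result = xmldoc.documentElement.toxml()
print(result)
```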