How to get and update the xml attribute using Python


I'm trying to update an attribute value in an XML document using Python's ElementTree, but when I try to access the attribute I get:

None

Here is the XML data:

import xml.etree.ElementTree as ET

xmldata='''<?xml version="1.0" encoding="UTF-8"?>
<WMS_Capabilities version="1.3.0" xmlns="http://www.opengis.net/wms" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wms http://schemas.opengis.net/wms/1.3.0/capabilities_1_3_0.xsd">
    <!-- Service Metadata -->
    <Capability>
        <Request>
            <GetCapabilities>
                <Format>text/xml</Format>
                <DCPType>
                    <HTTP>
                        <Get>
                            <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink"
                            xlink:type="simple"
                            xlink:href="http://localhost:8080" />
                        </Get>
                        <Post>
                            <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink"
                            xlink:type="simple"
                            xlink:href="http://localhost:8080" />
                        </Post>
                    </HTTP>
                </DCPType>
            </GetCapabilities>
            <GetMap>
                <Format>image/jpg</Format>
                <Format>image/png</Format>
                <DCPType>
                    <HTTP>
                        <Get>
                            <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" 
                            xlink:type="simple" 
                            xlink:href="http://localhost:8080" />
                        </Get>
                    </HTTP>
                </DCPType>
            </GetMap>
            <GetFeatureInfo>
                <Format>text/xml</Format>
                <Format>text/plain</Format>
                <Format>text/html</Format>
            </GetFeatureInfo>
        </Request>
        <Exception>
            <Format>XML</Format>
            <Format>INIMAGE</Format>
            <Format>BLANK</Format>
        </Exception>
    </Capability>
</WMS_Capabilities>'''

ns='{http://www.opengis.net/wms}'
myroot = ET.fromstring(xmldata)
for x in myroot.findall(ns+"Capability/" + ns + "Request/" + ns + "GetMap/" + ns + "DCPType/" + ns + "HTTP/" + ns + "Get/" + ns + "OnlineResource"):
    print(x)

output:

<Element '{http://www.opengis.net/wms}OnlineResource' at 0x7f0e0ab8e130>

selectedNode=myroot.find(ns + "Capability/" + ns + "Request/" + ns + "GetMap/" + ns + "DCPType/" + ns + "HTTP/" + ns + "Get/" + ns + "OnlineResource").tag
print(selectedNode)

output:

{http://www.opengis.net/wms}OnlineResource

selectedAttribute=myroot.find(ns + "Capability/" + ns + "Request/" + ns + "GetMap/" + ns + "DCPType/" + ns + "HTTP/" + ns + "Get/" + ns + "OnlineResource").attrib['xlink:href']
print(selectedAttribute)

output:

Traceback (most recent call last):
  File "<string>", line 59, in <module>
KeyError: 'xlink:href'

I want to update the value of the 'xlink:href' attribute and save the XML file.

I can't access the attribute: the lookup above raises a KeyError. I'm not sure whether this is the right way to read an attribute value. How can I read and update it?

I have referred to the following links:

  1. How to extract xml attribute using Python ElementTree
  2. How can I get the href attribute of *any* element in an XML (included deeply nested ones) using xpath?
  3. How do I extract value of XML attribute in Python?
  4. Reading XML file and fetching its attributes value in Python
  5. https://docs.python.org/3/library/xml.etree.elementtree.html
  6. https://www.edureka.co/blog/python-xml-parser-tutorial/

Solution

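  • ElementTree expands namespace prefixes when it parses, so the attribute is not stored under the key 'xlink:href'. Its key is the local name href preceded by the namespace URI that the xlink prefix is bound to, wrapped in braces. Continuing from your sample code: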

    >>> node = myroot.find(ns + "Capability/" + ns + "Request/" + ns + "GetMap/" + ns + "DCPType/" + ns + "HTTP/" + ns + "Get/" + ns + "OnlineResource")
    >>> node.attrib['{http://www.w3.org/1999/xlink}href']
    'http://localhost:8080'
    

    If you don't know the namespace URI beforehand, you can still find the href attribute at runtime by matching on the end of the key. Note that this matches any attribute whose name ends in href, whatever its namespace:

    for key, val in node.attrib.items():
        if key.endswith('href'):
            print(val)
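
The question also asks how to update the value and save the file, so here is a minimal sketch of that step. It assumes a trimmed-down copy of the document from the question, uses a made-up replacement URL (http://example.org:8080), and writes to an in-memory buffer rather than a real file. The "wms" prefix in the lookup path is just a local alias chosen for this example. Registering the namespaces before writing is optional, but without it ElementTree serializes them with generated prefixes such as ns0 and ns1.

```python
import io
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the document in the question (same namespaces).
xmldata = b'''<?xml version="1.0" encoding="UTF-8"?>
<WMS_Capabilities version="1.3.0" xmlns="http://www.opengis.net/wms">
    <Capability>
        <Request>
            <GetMap>
                <DCPType>
                    <HTTP>
                        <Get>
                            <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink"
                            xlink:type="simple"
                            xlink:href="http://localhost:8080" />
                        </Get>
                    </HTTP>
                </DCPType>
            </GetMap>
        </Request>
    </Capability>
</WMS_Capabilities>'''

WMS = "http://www.opengis.net/wms"
XLINK = "http://www.w3.org/1999/xlink"

# Prefix-to-URI map used only for lookups; "wms" is an arbitrary local alias.
ns = {"wms": WMS}

# Keep the original prefixes on output instead of generated ns0/ns1 ones.
ET.register_namespace("", WMS)
ET.register_namespace("xlink", XLINK)

root = ET.fromstring(xmldata)
node = root.find(
    "wms:Capability/wms:Request/wms:GetMap/wms:DCPType/wms:HTTP/wms:Get/wms:OnlineResource",
    ns,
)

# Attribute keys use the expanded {namespace-uri}localname form.
href_key = f"{{{XLINK}}}href"
old_value = node.get(href_key)
node.set(href_key, "http://example.org:8080")  # hypothetical new value

# Serialize to a buffer here; pass a file name instead to write to disk.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="UTF-8", xml_declaration=True)
updated_xml = buffer.getvalue().decode("utf-8")
print(updated_xml)
```

To save to disk, the same write call accepts a path, for example ET.ElementTree(root).write("capabilities.xml", encoding="UTF-8", xml_declaration=True), where the file name is only a placeholder. Comments in the source, such as the Service Metadata one, are dropped by the default parser.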