python xml tags celementtree

URL in all XML Element Tags


I am using the cElementTree package in Python and am having trouble with the elements' tags. Each tag includes the value of an attribute (a URL) in the tag name itself. It seems to be a problem with the parser. Please review the code below:

from xml.etree import cElementTree as ET  # removed in Python 3.9; use xml.etree.ElementTree there

path = 'C:\\Users\\myusername\\Desktop\\test.xml'
tree = ET.parse(path)
root = tree.getroot()

print(root.tag)
# prints: {http://www.aftmark.org}DATA

Where the XML looks like this (excerpt):

<DATA xmlns:xsd="http://www.w.org/2008/XMLsca" xmlns="http://www.aftmark.org">
  <Header>
    <DATAVersion>6.5</DATAVersion>
  </Header>
  <Items>
    <Item MaintenanceType="A">
      <HazardousMaterialCode>N</HazardousMaterialCode>
      <ExtendedInformation>
    </Item>

Any idea why the URL '{http://www.aftmark.org}' is included? I am parsing a lot of files, and that URL changes from file to file, although the DATA tag does not. (My fallback is to take the URL from root.tag and prepend it to every tag name I pass to ET.find() later on.) Thanks!


Solution

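  • It's because DATA (and all of its descendants) is in the default namespace http://www.aftmark.org, which is declared by the xmlns attribute on the root element.

    What you're seeing is the expanded name: the namespace URI in braces followed by the local name (also known as Clark notation).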

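    For more on namespaces in ElementTree, see the "Parsing XML with Namespaces" section of the xml.etree.ElementTree documentation.

    For more on XML namespaces in general, see the W3C "Namespaces in XML" recommendation.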

    Also, see this answer for another way to capture unknown namespaces to use in find/findall.
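
As a rough sketch of the fallback mentioned in the question (reading the namespace from root.tag and reusing it in find), something like the following should work. It assumes the standard xml.etree.ElementTree module; the split_tag helper, the SAMPLE string, and the "d" prefix are hypothetical names introduced only for illustration, and the sample is a shortened, well-formed stand-in for the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for one of the files being parsed.
SAMPLE = """<DATA xmlns="http://www.aftmark.org">
  <Header>
    <DATAVersion>6.5</DATAVersion>
  </Header>
</DATA>"""


def split_tag(tag):
    """Split a Clark-notation tag into (namespace_uri, local_name)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag  # the element is not in any namespace


root = ET.fromstring(SAMPLE)
uri, local = split_tag(root.tag)

# Option 1: spell out the Clark-notation names by hand.
version = root.find("{%s}Header/{%s}DATAVersion" % (uri, uri))

# Option 2: map a prefix of our own choosing to the discovered URI.
ns = {"d": uri}  # "d" is arbitrary; it need not appear in the XML
version_via_map = root.find("d:Header/d:DATAVersion", ns)

print(local, uri, version.text, version_via_map.text)
```

Because the URI is read from each file's root element, the same lookups keep working when the namespace differs between files. On Python 3.8 and later, find and findall also accept a "{*}" wildcard (for example "{*}Header"), which matches the local name in any namespace and avoids extracting the URI at all.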