python xml xpath lxml kml

How to return a list of folder elements in this kml file?


Here is the top of the file:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Folder>
      <name>Points</name>
      <Placemark>
        <name>Port Saeed, Dubai</name>
        <styleUrl>#icon-1899-0288D1-nodesc</styleUrl>
        <Point>
          <coordinates>
            55.3295568,25.2513145,0
          </coordinates>
        </Point>
      </Placemark>
      <Placemark>
        <name>Retail Location #1</name>
        <description>Paris, France</description>
        <styleUrl>#icon-1899-0288D1</styleUrl>
        <Point>
          <coordinates>
            2.3620605,48.8867304,0
          </coordinates>
        </Point>
      </Placemark>
      <Placemark>
        <name>Odessa Oblast</name>
...

I would like to extract the "Folder" elements.

Here is my code:

tree = ET.parse(kml)
root = tree.getroot()

for element in root:
    print element.findall('.//{http://www.opengis.net/kml/2.2/}Folder')

Right now this prints []. I believe it's a problem with the namespace, but I can't figure out how to build that string. Would it be better to use XPath instead? I suspect I would run into the same namespace problem, though.


Solution

  • The namespace URI in your search path must match the document's xmlns value exactly, so it should not end with a forward slash. Also consider iterating through all descendants of each Folder, since that node contains both child and grandchild elements.

    import xml.etree.ElementTree as ET
    
    root = ET.fromstring('''<?xml version="1.0" encoding="UTF-8"?>
    <kml xmlns="http://www.opengis.net/kml/2.2">
      <Document>
        <Folder>
          <name>Points</name>
          <Placemark>
            <name>Port Saeed, Dubai</name>
            <styleUrl>#icon-1899-0288D1-nodesc</styleUrl>
            <Point>
              <coordinates>
            55.3295568,25.2513145,0
              </coordinates>
            </Point>
          </Placemark>
          <Placemark>
            <name>Retail Location #1</name>
            <description>Paris, France</description>
            <styleUrl>#icon-1899-0288D1</styleUrl>
            <Point>
              <coordinates>
            2.3620605,48.8867304,0
              </coordinates>
            </Point>
          </Placemark>
        </Folder>
      </Document>
    </kml>''')
    
    # FIND ALL FOLDERS
    for i in root.findall('.//{http://www.opengis.net/kml/2.2}Folder'):
        # FIND ALL FOLDER'S DESCENDANTS
        for inner in i.findall('.//*'):
            data = (inner.text or '').strip()   # TEXT MAY BE None; STRIP LEAD/TRAIL WHITESPACE
            if data:                            # LEAVE OUT EMPTY ELEMENTS
                print(data)
    
    # Points
    # Port Saeed, Dubai
    # #icon-1899-0288D1-nodesc
    # 55.3295568,25.2513145,0
    # Retail Location #1
    # Paris, France
    # #icon-1899-0288D1
    # 2.3620605,48.8867304,0
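
If it is unclear what the namespace string should be, one option is to read it off the root element's tag, since ElementTree stores every tag in `{uri}localname` form. The sketch below assumes a cut-down stand-in for the KML document, and the helper name `namespace_of` is made up for illustration:

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the KML file in the question (assumed content).
KML = '''<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Folder><name>Points</name></Folder>
  </Document>
</kml>'''


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none.

    Hypothetical helper: it relies on ElementTree's '{uri}localname' tags.
    """
    tag = element.tag
    if tag.startswith('{'):
        return tag[1:tag.index('}')]
    return ''


root = ET.fromstring(KML)
ns_uri = namespace_of(root)

# Build the search path from the URI actually used by the document.
folders = root.findall('.//{%s}Folder' % ns_uri)

# A URI with a trailing slash is a different namespace, so nothing matches.
no_match = root.findall('.//{%s/}Folder' % ns_uri)
```

Here `folders` holds the Folder elements, while `no_match` is empty, which is the same symptom as the `[]` printed in the question.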
    

    For a nested list, collect the node text into one inner list per Folder:

    data = []
    for i in root.findall('.//{http://www.opengis.net/kml/2.2}Folder'):
        inner = []
        for t in i.findall('.//*'):
            txt = (t.text or '').strip()   # text may be None for empty elements
            if txt:
                inner.append(txt)
    
        data.append(inner)
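
To get the Folder elements themselves as a list, which is what the question asks for, note that `findall` already returns one. A possible way to avoid repeating the full URI in every path is the `namespaces` argument, which maps a prefix of your choosing to the URI. A minimal sketch, assuming a shortened stand-in document (the second folder is invented for illustration) and the arbitrary prefix `kml`:

```python
import xml.etree.ElementTree as ET

# Shortened stand-in document; the "Lines" folder is made up for this example.
KML = '''<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Folder><name>Points</name></Folder>
    <Folder><name>Lines</name></Folder>
  </Document>
</kml>'''

# Prefix-to-URI map; the prefix "kml" is arbitrary, the URI must match xmlns exactly.
NS = {'kml': 'http://www.opengis.net/kml/2.2'}

root = ET.fromstring(KML)

# findall returns a plain list of Element objects, one per Folder.
folders = root.findall('.//kml:Folder', NS)

# Read the direct <name> child of each Folder as a quick check.
folder_names = [f.findtext('kml:name', default='', namespaces=NS) for f in folders]
```

If you switch to lxml, the same kind of prefix map can be passed to its `xpath()` method through the `namespaces` keyword, so the namespace still has to be spelled out either way.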