python xml-parsing lxml elementtree pykml

Trying to parse KML file with pyKML and extract data


<kml>
<Document>
    <name>EP-1B-03</name>
    <Style id="narda">
        <LineStyle>
            <color>ffff0000</color>
            <width>3</width>
        </LineStyle>
    </Style>
    <Folder>
        <name>WIDEBAND [1]</name>
        <visibility>0</visibility>
        <Folder>
            <name>[0 V/m &lt; X ≤ 0.15 V/m]</name>
            <visibility>0</visibility>
            <Placemark>
                <name>Value WIDEBAND: Low V/m</name>
                <styleUrl>#lvl_0_1</styleUrl>
                <visibility>0</visibility>
                <description><![CDATA[<table><tr><th>Date and time: </th><td>05/02/20 09:32:28</td></tr><tr><th>Temperature: </th><td>23 C° </td></tr><tr><th>Relative humidity: </th><td>23 % </td></tr><tr><th>Battery: </th><td>3.09 V </td></tr><tr><th>Speed: </th><td>4 km/h </td></tr><tr><th>Acceleration x: </th><td>-0.02 g </td></tr><tr><th>Acceleration y: </th><td>-0.01 g </td></tr><tr><th>Acceleration z: </th><td>0.00 g </td></tr></table>]]></description>
                <Point>
                    <coordinates>8.16007,44.0748641666667,0 </coordinates>
                </Point>
            </Placemark>
            <Placemark>
                <name>Value WIDEBAND: Low V/m</name>
                <styleUrl>#lvl_0_1</styleUrl>
                <visibility>0</visibility>
                <description><![CDATA[<table><tr><th>Date and time: </th><td>05/02/20 09:32:28</td></tr><tr><th>Temperature: </th><td>23 C° </td></tr><tr><th>Relative humidity: </th><td>23 % </td></tr><tr><th>Battery: </th><td>3.09 V </td></tr><tr><th>Speed: </th><td>4 km/h </td></tr><tr><th>Acceleration x: </th><td>-0.01 g </td></tr><tr><th>Acceleration y: </th><td>0.01 g </td></tr><tr><th>Acceleration z: </th><td>-0.02 g </td></tr></table>]]></description>
                <Point>
                    <coordinates>8.1600825,44.0748745833333,0 </coordinates>
                </Point>
            </Placemark>
            <Placemark>
                <name>Value WIDEBAND: Low V/m</name>
                <styleUrl>#lvl_0_1</styleUrl>
                <visibility>0</visibility>
                <description><![CDATA[<table><tr><th>Date and time: </th><td>05/02/20 09:32:28</td></tr><tr><th>Temperature: </th><td>23 C° </td></tr><tr><th>Relative humidity: </th><td>23 % </td></tr><tr><th>Battery: </th><td>3.09 V </td></tr><tr><th>Speed: </th><td>4 km/h </td></tr><tr><th>Acceleration x: </th><td>-0.01 g </td></tr><tr><th>Acceleration y: </th><td>0.01 g </td></tr><tr><th>Acceleration z: </th><td>-0.02 g </td></tr></table>]]></description>
                <Point>
                    <coordinates>8.160075,44.0748683333333,0 </coordinates>
                </Point>
            </Placemark>
    </Folder>

This is my KML file.

And here is my code:

from pykml import parser
from os import path
import pandas as pd
from lxml import etree
from pykml.factory import nsmap

kml_file = path.join( r'C:\Users\paliou\Documents\ep-1b-03.kml')
namespace = {"ns" : nsmap[None]}
with open(kml_file) as f:
    tree = parser.parse(f)
    root = tree.getroot()
    N = 0
    placemarks = {}
    for ch in root.Document.Folder.Folder.Placemark.getchildren():
        name = ch[0]
        print (name)
        for pl in ch.getchildren():
            print (pl)

Why does this return neither an error nor all of the data? I want to extract the latitudes, longitudes, names, and some information from the description tag. So far I have only managed to print the data of the first Placemark: Placemark.name, Placemark.styleUrl, Placemark.visibility, and Placemark.description.

Output:

Value WIDEBAND: Low V/m

#lvl_0_1

0

Date and time: 05/02/20 09:32:28Temperature: 23 C° Relative humidity: 23 % Battery: 3.09 V Speed: 4 km/h Acceleration x: -0.02 g Acceleration y: -0.01 g Acceleration z: 0.00 g

8.16007,44.0748641666667,0

Is there a better way to do it? I found a sample somewhere that uses .findall(".//ns:Placemark", namespaces=namespace). Is there a way to retrieve the data by tag name? (I couldn't make it work.)


Solution

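  • You are making this more complicated than necessary. In your loop, root.Document.Folder.Folder.Placemark refers only to the first Placemark, so getchildren() walks the children of that single element. The standard library's xml.etree.ElementTree is enough to visit every Placemark: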

    import xml.etree.ElementTree as ET
    from pathlib import Path
    
    
    kml_file_path = Path(r'68083057.xml')
    tree = ET.parse(kml_file_path)
    root = tree.getroot()
    print(root.tag)  # kml
    for placemark_node in root.findall("Document/Folder/Folder/Placemark"):
        print("Placemark =====")
        for child in placemark_node:
            print(" ", child.tag, child.text)
    

    This prints:

    Placemark =====
      name Value WIDEBAND: Low V/m
      styleUrl #lvl_0_1
      visibility 0
      description <table> ...
      Point 
                        
    Placemark =====
      name Value WIDEBAND: Low V/m
      styleUrl #lvl_0_1
      visibility 0
      description <table> ...
      Point 
                        
    Placemark =====
      name Value WIDEBAND: Low V/m
      styleUrl #lvl_0_1
      visibility 0
      description <table> ...
      Point
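
To go from printing the tags to the fields asked for in the question (name, longitude, latitude, and the values in the description table), something like the following might work. It is only a sketch under a few assumptions: every Placemark has a Point with coordinates in "lon,lat,alt" order, and the description CDATA is a well-formed <table> as in the sample. The helper names parse_description and extract_placemarks and the inline SAMPLE are made up for illustration. The "{*}" wildcard (Python 3.8+) matches a tag with or without a namespace, so the same code should also cope with files that declare a KML xmlns.

```python
import xml.etree.ElementTree as ET


def parse_description(html_text):
    """Turn the <table> stored in a description into a {header: value} dict."""
    # Assumption: the CDATA payload is a well-formed <table>, as in the sample.
    table = ET.fromstring(html_text)
    fields = {}
    for row in table.iter("tr"):
        header = row.findtext("th", default="").strip().rstrip(":")
        value = row.findtext("td", default="").strip()
        fields[header] = value
    return fields


def extract_placemarks(root):
    """Return one dict per Placemark: name, longitude, latitude, description fields."""
    records = []
    # "{*}" matches any (or no) namespace; requires Python 3.8+.
    for pm in root.findall(".//{*}Placemark"):
        # Assumption: coordinates are "lon,lat,alt" and always present.
        lon, lat, *_ = pm.findtext(".//{*}coordinates").strip().split(",")
        record = {
            "name": pm.findtext("{*}name"),
            "longitude": float(lon),
            "latitude": float(lat),
        }
        description = pm.findtext("{*}description")
        if description:
            record.update(parse_description(description))
        records.append(record)
    return records


# Small inline sample in the same shape as the question's file (closing tags added).
SAMPLE = """<kml><Document><Folder><Folder>
<Placemark>
  <name>Value WIDEBAND: Low V/m</name>
  <description><![CDATA[<table><tr><th>Date and time: </th><td>05/02/20 09:32:28</td></tr><tr><th>Speed: </th><td>4 km/h </td></tr></table>]]></description>
  <Point><coordinates>8.16007,44.0748641666667,0 </coordinates></Point>
</Placemark>
</Folder></Folder></Document></kml>"""

records = extract_placemarks(ET.fromstring(SAMPLE))
print(records)
```

For a file on disk, pass ET.parse(kml_file_path).getroot() instead of the parsed sample. Since the question already imports pandas, the resulting list of dicts can be handed to pd.DataFrame(records) to get one row per Placemark.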