I have a KML file that contains a lot of information about each location, e.g. community name, area, length, name, etc.
1. How can I extract this information and convert all of it into a pandas DataFrame?
2. How can I load the KML file with pandas and plot the polygons?
'''
<?xml version="1.0" encoding="utf-8" ?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Schema name="USA_community_file" id="Dubai_community_file">
<SimpleField name="CNAME_E" type="string"></SimpleField>
<SimpleField name="CNAME_A" type="string"></SimpleField>
<SimpleField name="OBJECTID" type="int"></SimpleField>
<SimpleField name="LABEL_E" type="string"></SimpleField>
<SimpleField name="LABEL_A" type="string"></SimpleField>
<SimpleField name="C_PREFIX_E" type="string"></SimpleField>
<SimpleField name="C_PREFIX_A" type="string"></SimpleField>
<SimpleField name="COMMUNITY_" type="string"></SimpleField>
<SimpleField name="COMMUNITY1" type="string"></SimpleField>
<SimpleField name="DGIS_ID" type="string"></SimpleField>
<SimpleField name="COMM_NUM" type="int"></SimpleField>
<SimpleField name="NDGIS_ID" type="int"></SimpleField>
<SimpleField name="SHAPE_AREA" type="float"></SimpleField>
<SimpleField name="SHAPE_LEN" type="float"></SimpleField>
</Schema>
<Folder><name>Dubai_community_file</name>
<Placemark>
<Style><LineStyle><color>ff0000ff</color></LineStyle><PolyStyle><fill>0</fill></PolyStyle></Style>
<ExtendedData><SchemaData schemaUrl="#Dubai_community_file">
<SimpleData name="CNAME_E">WORLD ISLANDS</SimpleData>
<SimpleData name="CNAME_A">جزالعالم</SimpleData>
<SimpleData name="OBJECTID">52</SimpleData>
<SimpleData name="LABEL_E">WORLD ISLANDS</SimpleData>
<SimpleData name="LABEL_A">جزر االم</SimpleData>
<SimpleData name="C_PREFIX_E">Community:</SimpleData>
<SimpleData name="COMMUNITY_">WORLD ISLANDS - 303</SimpleData>
<SimpleData name="COMMUNITY1">جزر العالم - 303</SimpleData>
<SimpleData name="DGIS_ID">00049</SimpleData>
<SimpleData name="COMM_NUM">33</SimpleData>
<SimpleData name="NDGIS_ID">49</SimpleData>
<SimpleData name="SHAPE_AREA">740999.1710000634</SimpleData>
<SimpleData name="SHAPE_LEN">322.040010000</SimpleData>
</SchemaData></ExtendedData>
<MultiGeometry><Polygon><altitudeMode>clampToGround</altitudeMode><outerBoundaryIs><LinearRing><altitudeMode>clampToGround</altitudeMode><coordinates>55.199335819934504,25.209713479513255 55.193858362513538,25.20134197109752 55.187450889885667,25.195407028080979 55.185431717640654,25.194070811154006 55.180434849528297,25.191453404047706 55.179681664417046,25.191007641430815 55.174605170063955,25.188003185589196 55.168775490599842,25.186099616784588 55.161637107582351,25.184315021030216 55.15818688912384,25.18443399408045 55.155212562866552,25.18443399408045 55.152357209659556,25.18443399408045 55.149620829502737,25.18443399408045 55.145218826642065,25.185266805432605 55.140340931580113,25.186575508985754 55.1366767021133,25.188598050840653 55.13323857561312,25.190168578888347 55.132369737210468,25.190858538796249 55.1233357053876,25.193356972852314 55.127253896048046,25.197521029612517 55.125112381142742,25.202636870775109 55.23684704539301,25.206919900585604 55.123565731489066,25.2097752537926 55.123803677589592,25.21179779564749 55.124517515891284,25.214415202754026 55.125231354192977,25.218103367313006 55.125469300293673,25.221077693570294 55.125541469788061,25.221528752909364 55.125945192494896,25.224052019827582 55.127015949947349,25.2273832652353 55.128681572651487,25.230595537593615 55.128782793301468,25.230944186498903 55.129752330104168,25.2342837021652 55.130585141456095,25.236782136208717 55.13177487195901,25.238804678063786 55.13415433296484,2.24177900431074 55.135344063467755,25.245586141930346 55.13724763227259,25.250464036992241 55.139270174127489,25.25260555897545 55.142601419535708,25.254628093752444 55.150453640854892,25.257007554758275 55.159733538777516,25.2605767626702 55.16603911044308,25.26295620727285 55.172130530618062,25.264026964725531 55.180553822578531,25.262884823442789 55.189262649859927,25.260172237896029 55.196344507220033,25.257622769246325 55.198756699273247,2.25320017149003 55.201255133329312,25.248084575986411 
55.203396648234616,25.242492842622767 55.205062270938754,25.234759594353818 55.205419190089515,25.228572995738546 55.2042294595866,25.223100235425193 55.201493079430008,25.215366987156244 55.199335819934504,25.209713479513255</coordinates></LinearRing></outerBoundaryIs></Polygon></MultiGeometry>
</Placemark>'''
What I tried (without success):
https://blog.toadworld.com/2017/11/03/python-for-data-science-importing-xml-to-pandas-dataframe
https://stackoverflow.com/questions/13712132/extract-coordinates-from-kml-batchgeo-file-with-python
Sample code:
import pandas as pd
from pykml import parser

# Attempt 1: parse with pykml and read a field directly
root = parser.fromstring(open('test.kml', 'r').read().encode('utf-8'))
print(root.Document.USA_community_file.COMMUNITY_)

# Attempt 2: walk the parsed XML and collect values into a DataFrame
def getvalueofnode(node):
    return node.text if node is not None else None

for node in parsedXML.getroot():
    name = node.attrib.get('Schema name')
    cname = node.find('COMMUNITY_')

dfcols = ['name', 'cname']
df = pd.DataFrame(columns=dfcols)
df = df.append(
    pd.Series([name, getvalueofnode(cname)], index=dfcols),
    ignore_index=True)
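For question 1, one possible approach is sketched below using only the standard library's `xml.etree.ElementTree`. It assumes every attribute sits in a `<SimpleData name="...">` element inside a `<Placemark>`, as in the excerpt above. The function name `placemarks_to_rows` and the shortened `SAMPLE_KML` string are illustrative and not part of the original file.

```python
import xml.etree.ElementTree as ET

# KML 2.2 namespace, taken from the xmlns attribute of the <kml> element.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Shortened, illustrative excerpt of the file shown in the question.
SAMPLE_KML = """<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc">
<Folder><name>Dubai_community_file</name>
<Placemark>
<ExtendedData><SchemaData schemaUrl="#Dubai_community_file">
<SimpleData name="CNAME_E">WORLD ISLANDS</SimpleData>
<SimpleData name="OBJECTID">52</SimpleData>
<SimpleData name="COMMUNITY_">WORLD ISLANDS - 303</SimpleData>
</SchemaData></ExtendedData>
</Placemark>
</Folder>
</Document>
</kml>"""


def placemarks_to_rows(kml_text):
    """Return one dict per <Placemark>, keyed by the SimpleData name."""
    # Parse from bytes so the encoding declaration in the file is honoured.
    data = kml_text.encode("utf-8") if isinstance(kml_text, str) else kml_text
    root = ET.fromstring(data)
    rows = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        row = {}
        for field in placemark.iterfind(".//kml:SimpleData", KML_NS):
            row[field.get("name")] = field.text  # values stay as strings
        rows.append(row)
    return rows


rows = placemarks_to_rows(SAMPLE_KML)
print(rows)
```

Passing the resulting list to `pandas.DataFrame(rows)` should give one row per placemark and one column per field. All values arrive as strings, so numeric columns such as `SHAPE_AREA` would still need converting, for example with `pandas.to_numeric`.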
I have tried a few methods to extract specific information.
- Method 1: use a simple regex in Python on the KML (Sublime Text).
- Method 2: convert to CSV.
import geopandas as gpd
df = gpd.read_file(path + "data.geojson")
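For question 2, pandas has no geometry type of its own, so one option is to convert the KML into GeoJSON first and then load that file with geopandas, as in Method 2. The sketch below uses only the standard library and assumes each placemark holds polygons with an `outerBoundaryIs` ring (and optionally `innerBoundaryIs` rings), as in the excerpt above. The names `parse_ring` and `kml_to_geojson` are illustrative, and `SAMPLE_KML` is a truncated ring built from the first three coordinates of the original file.

```python
import json
import xml.etree.ElementTree as ET

# Namespace prefix in ElementTree's "{uri}tag" notation.
KML = "{http://www.opengis.net/kml/2.2}"

# Truncated, illustrative excerpt: first three points of the ring, then closed.
SAMPLE_KML = """<?xml version="1.0" encoding="utf-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document id="root_doc"><Folder>
<Placemark>
<ExtendedData><SchemaData schemaUrl="#Dubai_community_file">
<SimpleData name="CNAME_E">WORLD ISLANDS</SimpleData>
</SchemaData></ExtendedData>
<MultiGeometry><Polygon><outerBoundaryIs><LinearRing><coordinates>
55.199335819934504,25.209713479513255 55.193858362513538,25.20134197109752
55.187450889885667,25.195407028080979 55.199335819934504,25.209713479513255
</coordinates></LinearRing></outerBoundaryIs></Polygon></MultiGeometry>
</Placemark>
</Folder></Document>
</kml>"""


def parse_ring(coord_text):
    """Turn 'lon,lat[,alt] lon,lat ...' into a list of [lon, lat] pairs."""
    ring = []
    for token in coord_text.split():
        parts = token.split(",")
        ring.append([float(parts[0]), float(parts[1])])
    return ring


def kml_to_geojson(kml_text):
    """Build a GeoJSON FeatureCollection with one feature per Placemark."""
    root = ET.fromstring(kml_text.encode("utf-8"))
    features = []
    for placemark in root.iter(KML + "Placemark"):
        props = {sd.get("name"): sd.text
                 for sd in placemark.iter(KML + "SimpleData")}
        polygons = []
        for poly in placemark.iter(KML + "Polygon"):
            outer = poly.find(
                f"{KML}outerBoundaryIs/{KML}LinearRing/{KML}coordinates")
            if outer is None or not outer.text:
                continue  # skip polygons without an outer ring
            rings = [parse_ring(outer.text)]
            for inner in poly.findall(
                    f"{KML}innerBoundaryIs/{KML}LinearRing/{KML}coordinates"):
                rings.append(parse_ring(inner.text))  # holes, if any
            polygons.append(rings)
        geometry = ({"type": "MultiPolygon", "coordinates": polygons}
                    if polygons else None)
        features.append(
            {"type": "Feature", "properties": props, "geometry": geometry})
    return {"type": "FeatureCollection", "features": features}


collection = kml_to_geojson(SAMPLE_KML)
geojson_text = json.dumps(collection)
```

Writing `geojson_text` to a file such as `data.geojson` would let `gpd.read_file(...)` load it as a GeoDataFrame, and calling `.plot()` on that frame should draw the polygons (this requires matplotlib). Alternatively, `gpd.GeoDataFrame.from_features(collection["features"])` builds the frame without an intermediate file.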