Tags: python, openstreetmap, geopandas, overpass-api

Data from OSM (overpy) to geodataframe with polygons


I am trying to load OSM data (some polygons) into a GeoDataFrame. The export from OSM contains LineStrings, but in the end I need to convert all the data into a GeoDataFrame in this format:

0 -> name_from_tag_first_area -> polygon (or multipolygon) type with coordinates

1 -> name_from_tag_second_area -> polygon (or multipolygon) type with coordinates

I will then use this GeoDataFrame to visualize these polygons.

import overpy
import requests
import json
import geopandas as gpd

from shapely.geometry import shape

url = "https://maps.mail.ru/osm/tools/overpass/api/interpreter"
query = """[out:json];
area['boundary' = 'administrative']['name' = 'Москва'] -> .MSK;
(
relation(area.MSK)['admin_level' = 8]['boundary' = 'administrative']['name'='Бескудниковский район'];
relation(area.MSK)['admin_level' = 8]['boundary' = 'administrative']['name'='район Восточное Дегунино'];
);
convert item ::=::,::geom=geom(),_osm_type=type();
out geom;"""
response = requests.get(url, params={'data': query})
data = response.json()

geo_df = gpd.GeoDataFrame(data['elements'])

[Image: wrong result]

My dataframe contains no polygons, only GeometryCollections of LineStrings. Could you please explain how I can do this?


Solution

  • Apart from two additional imports (MultiPolygon and polygonize), I will keep the beginning of your script the same.

    import overpy
    import requests
    import json
    import geopandas as gpd
    from shapely.geometry import shape, MultiPolygon
    from shapely.ops import polygonize
    
    url = "https://maps.mail.ru/osm/tools/overpass/api/interpreter"
    query = """[out:json];
    area['boundary' = 'administrative']['name' = 'Москва'] -> .MSK;
    (
    relation(area.MSK)['admin_level' = 8]['boundary' = 'administrative']['name'='Бескудниковский район'];
    relation(area.MSK)['admin_level' = 8]['boundary' = 'administrative']['name'='район Восточное Дегунино'];
    );
    convert item ::=::,::geom=geom(),_osm_type=type();
    out geom;"""
    response = requests.get(url, params={'data': query})
    data = response.json()
    

    Your GeoDataFrame contains GeoJSON geometries (plain dictionaries), not shapely objects. First, convert each one to a shapely geometry with the shape function. Then use the polygonize function to assemble the LineStrings of each GeometryCollection into Polygons, and wrap those in a MultiPolygon.

    results_dict = [{
        'id': element['id'],
        'name': element['tags']['name'],
        'geometry': MultiPolygon(polygonize(shape(element['geometry']))),
    } for element in data['elements']]
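To see why the raw result holds only LineStrings: each boundary relation is made up of separate ways, and a polygon only exists once those ways are joined end to end into a closed ring. The following is a minimal pure-Python sketch of that joining step, written only for illustration. It is not how shapely implements polygonize, and the function name stitch_rings and the sample coordinates are hypothetical.

```python
def stitch_rings(segments):
    """Join open line segments end to end into closed rings.

    Each segment is a list of (lon, lat) tuples. Simplifying assumptions:
    every segment belongs to exactly one ring and endpoints match exactly.
    """
    remaining = [list(seg) for seg in segments]
    rings = []
    while remaining:
        ring = remaining.pop(0)
        # Keep extending until the ring closes on itself.
        while ring[0] != ring[-1]:
            for i, seg in enumerate(remaining):
                if seg[0] == ring[-1]:
                    # Segment continues the ring in its stored direction.
                    ring.extend(seg[1:])
                elif seg[-1] == ring[-1]:
                    # Segment is stored backwards, so append it reversed.
                    ring.extend(reversed(seg[:-1]))
                else:
                    continue
                remaining.pop(i)
                break
            else:
                raise ValueError("segments do not form a closed ring")
        rings.append(ring)
    return rings


# Three ways that together outline one triangle, given out of order
# and with one of them stored in the opposite direction.
ways = [
    [(0, 0), (4, 0)],
    [(0, 3), (0, 0)],
    [(0, 3), (4, 0)],
]
rings = stitch_rings(ways)
print(rings)
```

In the real script, polygonize does this work and also copes with cases this sketch ignores, such as several rings sharing nodes.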
    

    Now you can convert results_dict into a GeoDataFrame. Note that a boundary can consist of multiple Polygons. If you want each Polygon to have its own row, use the explode function.

    results_gdf = gpd.GeoDataFrame(results_dict)
    # or, with one row per Polygon
    results_gdf = gpd.GeoDataFrame(results_dict).explode()
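
What "one row per Polygon" means can be shown with plain dictionaries. This is only a conceptual stand-in for the effect of explode, not geopandas code: the helper explode_rows, the district names, and the placeholder strings standing in for polygons are all hypothetical.

```python
def explode_rows(rows):
    """Return one row per polygon, copying the other columns unchanged."""
    exploded = []
    for row in rows:
        for polygon in row["geometry"]:
            exploded.append({**row, "geometry": polygon})
    return exploded


# Placeholder strings stand in for real Polygon objects.
rows = [
    {"id": 1, "name": "district A", "geometry": ["polygon A1", "polygon A2"]},
    {"id": 2, "name": "district B", "geometry": ["polygon B1"]},
]
exploded = explode_rows(rows)
for row in exploded:
    print(row["id"], row["name"], row["geometry"])
```

A district whose boundary has two parts ends up as two rows with the same id and name, so keep the unexploded version if you need exactly one row per district.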