Tags: python, elasticsearch, indexing, geojson, kibana-7

Geopandas dataframe to GeoJSON to Elasticsearch index?


My question is related to this question: I'm relatively new to Python and have just started visualizing data in Kibana, which I've never used before. I have a GeoPandas dataframe with a geoseries column like this:

    ID      Geometry
0   9417    POLYGON ((229611.185 536552.731, 229611.100 53...
1   3606    POLYGON ((131122.280 460609.117, 131108.312 46...
2   1822    POLYGON ((113160.653 517762.384, 113169.755 51...
3   7325    POLYGON ((196861.725 470370.632, 196869.990 47...
4   9258    POLYGON ((201372.387 579807.340, 201373.195 57...

I would like to create a map with these polygons in Kibana, but I really don't know how. I've read various Elasticsearch docs and Stack Overflow posts, but I can't put the pieces together. In our project we want to import data in Python, preprocess it a bit, and export it to Kibana. So there is a Python -> GeoJSON -> Elasticsearch [7.6] pipeline, and none of the material I found covers all three parts, so I'm not sure how to proceed.

I also tried saving the data as a GeoJSON file and importing it through the Kibana dashboard, in the map visualization, as these instructions describe. When I import the data, Kibana doesn't create an index for my file, so none of my data gets visualized.

I read that you can't index a whole polygon and should split it into coordinates instead. My problem is that I can't find a good way to do this in Python. I also read that the Elasticsearch index needs the right mapping for geo indexing, but again, I get stuck creating this geo mapping from Python.

Could someone help me :)?


Solution

  • This should get you started:

    1. Import & initialize
    import shapely.geometry
    import geopandas
    from elasticsearch import Elasticsearch
    import json
    
    es = Elasticsearch(['http://localhost:9200'])
    geoindex = None
    
    2. Fetch or create the index (+ mapping, if needed)
    try:
        geoindex = es.indices.get(index='geoindex')
    except Exception:
        # Assume the index is missing; create it with a geo_shape field
        geoindex = es.indices.create(index='geoindex', body={
            "mappings": {
                "properties": {
                    "polygon": {
                        "type": "geo_shape",
                        "strategy": "recursive"
                    }
                }
            }
        })
    
    
    3. Dump as JSON and load back into a dict (inspired by this; I suspect there is a cleaner way)
    shapely_polygon = shapely.geometry.Polygon([(0, 0), (0, 1), (1, 0)])
    geojson_str = geopandas.GeoSeries([shapely_polygon]).to_json()
    
    4. Iterate & sync to ES
    for feature in json.loads(geojson_str)['features']:
        # One document per feature; the geometry goes into the mapped "polygon" field
        es.index(index='geoindex', body={ "polygon": {
            "type": "polygon",
            "coordinates": feature['geometry']['coordinates']
        }}, id=feature['id'])
    
    5. Verify
    # Freshly indexed documents only show up in the count after an index refresh
    count = es.count(index='geoindex')
    print(count)
    
    6. Visualize
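
A side note on steps 3 and 4: the per-feature conversion can be pulled into a small pure function, which makes it easy to check before anything is sent to Elasticsearch. The sketch below is only one possible approach. The helper names (`in_lon_lat_range`, `features_to_actions`) are made up for illustration, the sample dict is only shaped like what `GeoSeries.to_json()` returns once parsed, and the range check rests on an assumption: `geo_shape` expects WGS84 longitude/latitude, while the coordinates in the question look like projected values, so they would probably need reprojecting first (for example with GeoPandas' `to_crs(epsg=4326)`, provided the source CRS is set).

```python
def in_lon_lat_range(coords):
    """Return True if every (x, y) pair in a nested coordinate list
    lies within longitude [-180, 180] and latitude [-90, 90]."""
    if coords and isinstance(coords[0], (int, float)):
        lon, lat = coords[0], coords[1]
        return -180 <= lon <= 180 and -90 <= lat <= 90
    return all(in_lon_lat_range(c) for c in coords)


def features_to_actions(feature_collection, index_name="geoindex", field="polygon"):
    """Yield one bulk action per GeoJSON feature.

    Hypothetical helper: index_name and field mirror the names used in
    the steps above. Only geometries with a "coordinates" key are handled.
    """
    for feature in feature_collection["features"]:
        geometry = feature["geometry"]
        if not in_lon_lat_range(geometry["coordinates"]):
            raise ValueError(
                "feature %r is not in lon/lat degrees; reproject it first"
                % feature["id"]
            )
        yield {
            "_index": index_name,
            "_id": feature["id"],
            "_source": {
                field: {
                    # Lowercased to match the type name used in step 4
                    "type": geometry["type"].lower(),
                    "coordinates": geometry["coordinates"],
                }
            },
        }


# Sample input, shaped like json.loads(geojson_str) from step 3
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "id": "0",
            "type": "Feature",
            "properties": {},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]],
            },
        }
    ],
}

actions = list(features_to_actions(sample))
```

The resulting dicts use the `_index` / `_id` / `_source` layout that `elasticsearch.helpers.bulk(es, actions)` accepts, so for larger dataframes they could be sent in one bulk request instead of one `es.index` call per feature.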