Tags: python, pandas, dataframe, gis, geopandas

Unable to create geopandas geometry from geojson column


I have a CSV with a field that contains a valid GeoJSON representation of the geometry of my objects. I am able to load it as a JSON dict, which I then pass to geopandas set_geometry without errors. Although everything executes properly, I cannot figure out why my geometry is empty. The CSV was attached as a screenshot (not reproduced here).

Here is code:

    import json

    import geopandas as gpd
    import pandas as pd
    from shapely.geometry import shape

    df = pd.read_csv('c:/random_polygon.csv')

    def parse_geom(geom_str):
        try:
            return shape(json.loads(geom_str))
        except (TypeError, AttributeError):
            return None

    df["geometry"] = df["geo_properties"].apply(parse_geom)
    df = df[df.geo_properties != None]
    gdf = gpd.GeoDataFrame(df, geometry="geometry")
    gdf = gdf.set_crs('epsg:4326')
    print(gdf.head())

Here is the geojson:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [
            [
              -89.77632522583008,
              34.07143110146331
            ],
            [
              -89.78070259094238,
              34.05422388685686
            ],
            [
              -89.76662635803223,
              34.04825031787262
            ],
            [
              -89.74491119384766,
              34.04917482631512
            ],
            [
              -89.74748611450195,
              34.0705068357665
            ],
            [
              -89.75503921508789,
              34.08095756019248
            ],
            [
              -89.77726936340332,
              34.0810286491402
            ],
            [
              -89.78233337402344,
              34.08053102525307
            ],
            [
              -89.77632522583008,
              34.07143110146331
            ]
          ]
        ]
      }
    }
  ]
}

Solution

  • It looks like each cell in your DataFrame's geo_properties column contains a full GeoJSON document (a FeatureCollection) as a string. This isn't working because shapely.geometry.shape expects a bare geometry object such as a Polygon; it does not unpack a FeatureCollection.

    Try parsing each cell with geopandas and extracting the GeometryArray:

    In [18]: df.geo_properties.apply(
        ...:     lambda x: gpd.read_file(x, driver="GeoJSON").geometry.values
        ...: )
        ...:
    Out[18]:
    0    [POLYGON ((-89.77632522583008 34.0714311014633...
    Name: geo_properties, dtype: object
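
If you would rather keep the shapely-based approach from the question, one option is to pull the bare geometry dicts out of the FeatureCollection first. The following is only a sketch using the standard library: geometry_dicts is a hypothetical helper name (not part of shapely or geopandas), and the sample string is a made-up triangle rather than the polygon from the question.

```python
import json


def geometry_dicts(geojson_str):
    """Return the bare geometry dicts found in a GeoJSON string.

    Hypothetical helper: accepts a FeatureCollection, a single Feature,
    or a bare geometry, and returns [] for missing or unparseable input.
    """
    try:
        obj = json.loads(geojson_str)
    except (TypeError, ValueError):
        # TypeError: non-string cells such as NaN; ValueError: malformed JSON.
        return []
    if not isinstance(obj, dict):
        return []

    kind = obj.get("type")
    if kind == "FeatureCollection":
        features = obj.get("features") or []
    elif kind == "Feature":
        features = [obj]
    else:
        # Assume a bare geometry object (Point, Polygon, ...).
        has_geometry_keys = "coordinates" in obj or "geometries" in obj
        return [obj] if has_geometry_keys else []

    # Keep only features that actually carry a geometry.
    return [
        f["geometry"]
        for f in features
        if isinstance(f, dict) and f.get("geometry")
    ]


# Made-up sample with the same nesting as the GeoJSON in the question.
sample = (
    '{"type": "FeatureCollection", "features": [{"type": "Feature", '
    '"properties": {}, "geometry": {"type": "Polygon", '
    '"coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}}]}'
)
geoms = geometry_dicts(sample)
print(geoms[0]["type"])  # Polygon
```

Each returned dict is a plain geometry object, so it can be handed to shapely.geometry.shape as in the question's parse_geom. If every cell holds exactly one feature, taking the first element of the list (when it is non-empty) gives one geometry per row.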