
Using GeoPandas and Shapely to merge polygons (not always intersecting) within a single geometry column


I have data with the following structure (this isn't the exact data, but it reflects its form), where Lat and Long give the location of the farm and the distribution area is stored in the GeoPandas geometry column:

Farm type  Lat  Long  Avg yield  Max yield  geometry
Apples     x1   y1    50         100        POLYGON (a)
Apples     x1   y1    50         100        POLYGON (b)
Apples     x1   y1    50         100        POLYGON (c)
Bananas    x2   y2    100        150        POLYGON (d)
Bananas    x2   y2    100        150        POLYGON (e)
Bananas    x2   y2    100        150        POLYGON (f)
Oranges    x3   y3    70         100        POLYGON (g)
Oranges    x3   y3    70         100        POLYGON (h)
Oranges    x3   y3    70         100        POLYGON (i)

As you can see, the descriptive information is the same for every row of Apples, Bananas and Oranges EXCEPT for the polygon value in 'geometry'.

What I would like to do is merge the rows by 'Farm type' so that my DataFrame looks like this:

Farm type  Lat  Long  Avg yield  Max yield  geometry
Apples     x1   y1    50         100        MULTIPOLYGON (a,b,c)
Bananas    x2   y2    100        150        MULTIPOLYGON (d,e,f)
Oranges    x3   y3    70         100        MULTIPOLYGON (g,h,i)

Since the polygons describe the distribution area, they don't always intersect, so we can't assume intersection. For example, a distribution area could extend to an island, which is a separate polygon.

Some code I have is:

import geopandas as gpd
from shapely.ops import unary_union

for i in df['Farm type'].unique():
    temp_poly = df.loc[df['Farm type']==i]
    df.loc[df['Farm type']==i, 'geometry'] = gpd.GeoSeries(unary_union(temp_poly['geometry']))

But it doesn't seem to be doing what I want. I think I might be slicing the DataFrame with .loc[] incorrectly. I am happy to provide further information on request, but I can't share my actual data/code, which is why I made a toy scenario.


Solution

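  • Use geopandas.GeoDataFrame.dissolve, which groups the rows by a column and unions the geometries within each group, i.e.:

    df.dissolve("Farm type")
    

    Polygons in a group that don't intersect come back as a single MULTIPOLYGON. By default the other columns keep the first value in each group (aggfunc="first"), which is fine here because they are identical within a farm type, and "Farm type" becomes the index; pass as_index=False to keep it as a regular column. See the user guide for more info.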
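
    As a rough illustration of how dissolve treats polygons that don't intersect, here is a minimal sketch on made-up data. The coordinates, the box() rectangles and the numeric values are assumptions standing in for the real distribution areas, not the asker's data:

```python
import geopandas as gpd
from shapely.geometry import box

# Toy stand-in for the question's table: two farm types, several polygons each.
# All values and shapes below are invented for illustration.
df = gpd.GeoDataFrame(
    {
        "Farm type": ["Apples", "Apples", "Apples", "Bananas", "Bananas"],
        "Lat": [1.0, 1.0, 1.0, 2.0, 2.0],
        "Long": [10.0, 10.0, 10.0, 20.0, 20.0],
        "Avg yield": [50, 50, 50, 100, 100],
        "Max yield": [100, 100, 100, 150, 150],
    },
    geometry=[
        box(0, 0, 1, 1),        # Apples: overlaps the next rectangle
        box(0.5, 0, 1.5, 1),    # Apples: overlaps the previous rectangle
        box(5, 5, 6, 6),        # Apples: disjoint "island"
        box(10, 10, 11, 11),    # Bananas
        box(20, 20, 21, 21),    # Bananas: disjoint from the other one
    ],
)

# One row per farm type; geometries in each group are unioned.
# as_index=False keeps "Farm type" as a column instead of the index.
merged = df.dissolve(by="Farm type", as_index=False)

print(merged[["Farm type", "Avg yield", "Max yield"]])
print(merged.geom_type)
```

    In this sketch the two overlapping Apples rectangles are fused into one part and the disjoint one is kept as a second part, so each farm type ends up as a single row holding a MultiPolygon. Note that overlapping polygons are merged rather than kept as separate members, which differs slightly from the MULTIPOLYGON (a,b,c) notation in the question if a, b and c overlap.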