I have data with the following structure (this isn't the exact data, but it reflects its form), where Lat and Long
give the location of the farm and the distribution area is stored in the GeoPandas geometry
column:
Farm type | Lat | Long | Avg yield | Max yield | geometry |
---|---|---|---|---|---|
Apples | x1 | y1 | 50 | 100 | POLYGON (a) |
Apples | x1 | y1 | 50 | 100 | POLYGON (b) |
Apples | x1 | y1 | 50 | 100 | POLYGON (c) |
Bananas | x2 | y2 | 100 | 150 | POLYGON (d) |
Bananas | x2 | y2 | 100 | 150 | POLYGON (e) |
Bananas | x2 | y2 | 100 | 150 | POLYGON (f) |
Oranges | x3 | y3 | 70 | 100 | POLYGON (g) |
Oranges | x3 | y3 | 70 | 100 | POLYGON (h) |
Oranges | x3 | y3 | 70 | 100 | POLYGON (i) |
As you can see, the descriptive information is the same for each instance of Apples, Bananas and Oranges, EXCEPT for the Polygon value in 'geometry'.
What I would like to do is merge the rows that share a Farm type so that my DataFrame looks like:
Farm type | Lat | Long | Avg yield | Max yield | geometry |
---|---|---|---|---|---|
Apples | x1 | y1 | 50 | 100 | MULTIPOLYGON (a,b,c) |
Bananas | x2 | y2 | 100 | 150 | MULTIPOLYGON (d,e,f) |
Oranges | x3 | y3 | 70 | 100 | MULTIPOLYGON (g,h,i) |
Since the polygons describe the distribution area, they don't always intersect, so we can't assume intersection. For example, a distribution area could extend to an island, which is a separate polygon.
Some code I have is:
import geopandas as gpd
from shapely.ops import unary_union

for i in df['Farm type'].unique():
    temp_poly = df.loc[df['Farm type']==i]
    df.loc[df['Farm type']==i, 'geometry'] = gpd.GeoSeries(unary_union(temp_poly['geometry']))
But it doesn't seem to be doing what I want. I think I might be slicing the DataFrame with .loc[] incorrectly?
I am happy to provide further information on request, but I can't share my actual data/code, which is why I made a toy scenario.
Use geopandas.GeoDataFrame.dissolve, i.e.:
df.dissolve("Farm type")
See the user guide for more info.
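As a rough, self-contained sketch of what that call does on data shaped like the question's: the coordinates, the `square` helper and the numeric Lat/Long values below are made up for illustration, and only two farm types are included to keep it short.

```python
import geopandas as gpd
from shapely.geometry import Polygon


def square(x, y, size=1.0):
    """Hypothetical helper: axis-aligned square with lower-left corner (x, y)."""
    return Polygon([(x, y), (x + size, y), (x + size, y + size), (x, y + size)])


# Toy data: two disjoint polygons per farm type, identical attributes per type.
df = gpd.GeoDataFrame(
    {
        "Farm type": ["Apples", "Apples", "Bananas", "Bananas"],
        "Lat": [10.0, 10.0, 20.0, 20.0],
        "Long": [1.0, 1.0, 2.0, 2.0],
        "Avg yield": [50, 50, 100, 100],
        "Max yield": [100, 100, 150, 150],
    },
    geometry=[square(0, 0), square(5, 5), square(10, 10), square(15, 15)],
)

# dissolve groups the rows by "Farm type" and unions each group's geometries.
# With the default aggfunc="first", the other columns keep the first value in
# each group, which is fine here because they are identical within a group.
merged = df.dissolve(by="Farm type")

# The grouping key becomes the index; reset_index() restores it as a column.
merged = merged.reset_index()

print(merged[["Farm type", "Lat", "Long", "Avg yield", "Max yield"]])
print(merged.geometry.geom_type.tolist())
```

Because the polygons within each group do not touch, the union of each group should come out as a MultiPolygon rather than a single Polygon, which matches the desired output in the question. If the attribute columns could differ within a group, a different `aggfunc` (for example `"mean"` or `"max"`) may be more appropriate than the default.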