Search code examples
pythonpolygonshapely

How to extract zip codes from multipolygons?


I have a geodataframe containing utility company service areas. Each geodataframe has a multipolygon to represent the area to which the company provides service. I am trying to build a dataframe where these multipolygons can be represented by zip codes.

The dataframe looks like this:enter image description here

The goal would be for example the first utility company, if the geometry overlaps 10 zip codes, the new dataframe would have ten rows with the same utility company name and ID.

I have reverse geocoded single longitude and latitude coordinates before, but have never worked with polygons. Most of the resources on the web involve turning zip codes into polygons, not the other way around.

Edit: zip code geodataframe belowenter image description here


Solution

  • So basically all you need to do is use gpd.sjoin. If you only want the zip code from the other data frame, you would just use something like this:

    with_zip = gpd.sjoin(utility_gdf,zipcode_gdf[['ZIP_CODE','geometry']],how='left',op='intersects')
    

    See for further reference: Merging Data - GeoPandas

    Edit:

    After looking at the geometries, the two datasets are in fact using different coordinate reference systems. Getting the two crs to match up is a 2 step process;

    1. Transform the crs
    2. Set the crs type on the geoseries

    But first, you need to figure out what crs each data set is currently using. To find what crs each gdf is using, just type

    gdf.geometry.crs
    

    If either of the datasets have a 'NoneType' crs, you're going to have to do some googling to figure out what crs it actually is using.

    Once you figure out which crs you have, you then transform it. Here's a pretty good thread on transforming crs: GIS stack exchange thread

    Then once you have the actual geometry datapoints transformed to a new crs, you then need to set the appropriate crs type of the geoseries. For instance, if you transformed gdf1.geometry from "EPSG:2966" to "EPSG:4236", you would then call:

    gdf1.set_crs("EPSG:4236",inplace=True,allow_override=True)
    

    And then you can retry the merge operation.