Tags: gis, polygon, series, geopandas, shapely

Is it bad practice to have more than 1 geometry column in a GeoDataFrame?


I'm trying to create a GeoDataFrame with two zip codes per row so that I can compare the distance between them. I took a list of approximately 220 zip codes, ran itertools.combinations on them to get every pair, and then unpacked the tuples into two columns:

code_combo = list(itertools.combinations(df_with_all_zip_codes['code'], 2))
df_distance_ctr = pd.DataFrame(code_combo, columns=['first_code','second_code'])

Then I did some standard pandas merges and column renaming to bring the polygon/geometry column from the original GeoDataFrame into this new one, right beside the respective zip code columns. The problem is that I can't get the polygon columns to be read as geometry, even after:

1.) attempting to convert the DataFrame to a GeoDataFrame (AttributeError: No geometry data set yet)
2.) applying wkt.loads to the geometry column (AttributeError: 'MultiPolygon' object has no attribute 'encode')

I've looked for a way to convert a Series to a GeoSeries but can't find anything on SO or in the documentation. Can anyone please point out where I'm likely going wrong?


Solution

  • Looking at the __init__ method of GeoDataFrame at https://github.com/geopandas/geopandas/blob/master/geopandas/geodataframe.py, it looks like a GeoDataFrame can only have one active geometry column at a time. The other columns you've created should still hold geometry objects, though.

    Since you still have geometry objects in each column, you could write a function that uses Shapely's distance method, like so:

    import pandas as pd
    import geopandas
    from shapely.geometry import Point
    
    lats = [-34.58, -15.78, -33.45, 4.60, 10.48]
    lons = [-58.66, -47.91, -70.66, -74.08, -66.86]
    df = pd.DataFrame(
        {'City': ['Buenos Aires', 'Brasilia', 'Santiago', 'Bogota', 'Caracas'],
         'Country': ['Argentina', 'Brazil', 'Chile', 'Colombia', 'Venezuela'],
         'Latitude': lats,
         'Longitude': lons})
    
    # First geometry column: one Point per city, in (lon, lat) order.
    df['Coordinates'] = list(zip(df.Longitude, df.Latitude))
    df['Coordinates'] = df['Coordinates'].apply(Point)
    
    # Second geometry column: the same points in reverse row order.
    df['Coordinates_2'] = list(zip(lons[::-1], lats[::-1]))
    df['Coordinates_2'] = df['Coordinates_2'].apply(Point)
    
    # Only 'Coordinates' is the active geometry column;
    # 'Coordinates_2' is still a column of Shapely objects.
    gdf = geopandas.GeoDataFrame(df, geometry='Coordinates')
    
    
    def get_distance(row):
        # Planar distance in coordinate units (degrees here, not metres).
        return row.Coordinates.distance(row.Coordinates_2)
    
    gdf['distance'] = gdf.apply(get_distance, axis=1)
    

    As for AttributeError: 'MultiPolygon' object has no attribute 'encode': MultiPolygon is a Shapely geometry class, while encode is a method on string objects. wkt.loads expects a WKT string, so the error suggests the column already holds geometry objects, and you can probably remove the call to wkt.loads.
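
To tie the points above to the zip-code-pair table from the question, here is a minimal sketch rather than a definitive implementation. The column names first_geom and second_geom, the helper to_geometry, and the sample points are all hypothetical stand-ins for the merged data. It assumes the geometry columns may hold either Shapely objects or WKT strings, parses only the strings, names the active geometry column explicitly when building the GeoDataFrame, and wraps the second column in a GeoSeries so the distance can be computed without a row-wise apply.

```python
import geopandas
import pandas as pd
from shapely import wkt
from shapely.geometry import Point
from shapely.geometry.base import BaseGeometry


def to_geometry(value):
    """Return a Shapely geometry, parsing WKT only when given a string."""
    if isinstance(value, BaseGeometry):
        return value         # already a geometry, nothing to parse
    return wkt.loads(value)  # WKT text such as "POINT (3 4)"


# Hypothetical stand-in for the merged table: one row per zip-code pair.
pairs = pd.DataFrame({
    "first_code": ["A", "A"],
    "second_code": ["B", "C"],
    "first_geom": [Point(0, 0), Point(0, 0)],
    "second_geom": [Point(3, 4), "POINT (0 2)"],  # mixed types on purpose
})

# Normalise both columns so every cell is a Shapely geometry.
for col in ("first_geom", "second_geom"):
    pairs[col] = pairs[col].apply(to_geometry)

# Name the active geometry column explicitly when building the GeoDataFrame.
gdf = geopandas.GeoDataFrame(pairs, geometry="first_geom")

# Wrap the second column in a GeoSeries to use the vectorized distance method.
second = geopandas.GeoSeries(gdf["second_geom"])
gdf["distance"] = gdf.geometry.distance(second)

print(gdf[["first_code", "second_code", "distance"]])
```

The distances are planar and expressed in the units of the coordinates, so for zip code polygons stored as longitude/latitude you would probably want to reproject to a suitable projected CRS before measuring.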