python, geospatial, geopandas

Aggregating linestring geometry data to a grid


I am trying to aggregate data for road segments to a fixed grid using geopandas.

The road data looks like this:

|Variable | geometry |
|---------|----------|
| 59  |LINESTRING (440735.351 4319767.843, 440733.320...|
| 48  |LINESTRING (440733.320 4319607.463, 440732.508...|
| 64  |LINESTRING (440858.641 4329887.089, 440853.551...|
| 64  |LINESTRING (440920.578 4330030.844, 440890.661...|
| 68  |LINESTRING (218573.705 4257347.137, 218586.697...|   

The fixed grid I want to aggregate the road data to looks like this:

|geometry|
|--------|
| POLYGON ((238749.978 4498509.611, 238898.825 4..|
| POLYGON ((240086.217 4498360.636, 240234.824 4... |
| POLYGON ((241421.845 4498211.925, 241570.857 4..|
| POLYGON ((242758.152 4498063.434, 242906.923 4... |
| POLYGON ((244094.479 4497914.762, 244243.009 4... |

My current approach is to divide the variable by the road length to get a per-unit-length value, spatially join the roads to the grid, multiply that per-unit-length value by the road length within each grid cell, and sum over all roads in each cell, using this code:

    gdf_road['weighted_variable'] = gdf_road['Variable'] / gdf_road.geometry.length
    # predicate= replaces the deprecated op= keyword (geopandas >= 0.10)
    gdf_joined = gpd.sjoin(gdf_road, gdf_grid, how="inner", predicate="intersects")
    gdf_grid['agg_variable'] = (gdf_joined['weighted_variable'] * gdf_joined.geometry.length).groupby(gdf_joined['index_right']).sum()

The result keeps the spatial structure of the original road data, but the magnitude of the aggregated variable seems erroneously high. I am curious whether I have overlooked something in my logic for aggregating road data to a grid, for example double counting roads that span several cells, or needing a different weighting scheme. Any suggestions would be great. Thank you!
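To make the double-counting concern concrete, here is a small arithmetic sketch. All numbers are made up for illustration (one hypothetical road with a value of 60 and a length of 300 m, split evenly across three hypothetical cells); none of them come from the real data. It relies on the fact that a spatial join leaves each road's full geometry on every matched row.

```python
# Hypothetical numbers: one road with Variable = 60 and total length 300 m,
# with 100 m of that length inside each of three grid cells.
variable = 60.0
road_length = 300.0
length_in_cell = {"A": 100.0, "B": 100.0, "C": 100.0}

per_metre = variable / road_length

# After a spatial join, geometry.length is still the full road length on
# every matched row, so each cell receives the whole value again.
sjoin_style = {cell: per_metre * road_length for cell in length_in_cell}

# Weighting by the length that actually lies inside each cell instead.
clipped_style = {cell: per_metre * seg for cell, seg in length_in_cell.items()}

print("sjoin-style total:  ", sum(sjoin_style.values()))
print("clipped-style total:", sum(clipped_style.values()))
```

In this toy case the join-based total comes out at roughly three times the road's original value, while the clipped total matches it.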


Solution

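  • gpd.overlay seemed to do the trick. Unlike sjoin, which keeps each road's full geometry on every matching row, an intersection overlay clips the roads at the cell boundaries, so the length used in the weighting is the length inside each cell: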

    import geopandas as gpd

    # Variable per unit of road length
    gdf_road['weighted_variable'] = gdf_road['Variable'] / gdf_road.geometry.length
    # Clip the roads to the grid cells
    gdf_join = gpd.overlay(gdf_road, gdf_grid, how="intersection")
    # Length of each clipped piece; this assumes gdf_road.geometry.length above was measured in the same units
    gdf_join['new_length'] = gdf_join.to_crs('EPSG:26916').geometry.length
    gdf_join['new_variable'] = gdf_join['weighted_variable'] * gdf_join['new_length']

    # Spatial join & aggregate data
    joined_data = gpd.sjoin(gdf_join, gdf_grid, how="inner", predicate="within")
    grouped_data = joined_data.groupby('index_right')
    agg_data = grouped_data[['new_variable']].sum().reset_index()
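One way to sanity-check this kind of aggregation is to confirm that the per-cell values add back up to the original road totals. The sketch below does that on toy data. The road, the three square cells, and the column names `cell_id`, `per_metre`, and `share` are all hypothetical. It also carries a cell identifier through the overlay rather than running a second spatial join, which is a variation on the code above, not the same method.

```python
import geopandas as gpd
from shapely.geometry import LineString, box

# Hypothetical toy data: one 300 m road with Variable = 60 crossing
# three 100 m x 100 m cells, all in the same projected CRS.
roads = gpd.GeoDataFrame(
    {"Variable": [60.0]},
    geometry=[LineString([(0, 50), (300, 50)])],
    crs="EPSG:26916",
)
grid = gpd.GeoDataFrame(
    {"cell_id": [0, 1, 2]},
    geometry=[box(0, 0, 100, 100), box(100, 0, 200, 100), box(200, 0, 300, 100)],
    crs="EPSG:26916",
)

# Value per metre of road, computed before clipping.
roads["per_metre"] = roads["Variable"] / roads.geometry.length

# Clip the roads to the cells; each output row is one road piece in one cell
# and carries the attributes of both inputs (including cell_id).
pieces = gpd.overlay(roads, grid, how="intersection", keep_geom_type=True)
pieces["share"] = pieces["per_metre"] * pieces.geometry.length

# Sum the pieces per cell and attach the result to the grid.
per_cell = pieces.groupby("cell_id")["share"].sum()
grid["agg_variable"] = grid["cell_id"].map(per_cell).fillna(0.0)

print(grid[["cell_id", "agg_variable"]])
print("total after aggregation:", grid["agg_variable"].sum())
```

If the aggregated total differs noticeably from the sum of the original `Variable` column, that would suggest some road length falls outside the grid or is being counted more than once.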