Search code examples
postgisgeopandasshapelygeos

ST_Intersection behaviour


I have a linestring which traces part of it's path back over itself (a there and back trajectory). When I try to intersect this linestring with a polygon to try to get the distance travelled within the given polygon the intersects operation only keeps one leg of the duplicated part of the journey. Is there a way to keep both pieces of the trajectory to get an accurate measure of the distance travelled.

In terms of MRE I would like

select 
    ST_Length(ST_Intersection(
        ST_GeomFromText('POLYGON((-1 -1, 1 -1, 1 1, -1 1, -1 -1))'),
        ST_GeomFromText('LINESTRING(0 -1, 0 0, 1 0, 0 0, 0 1)')))

to give the result 4 rather than 3.

The only way i can seem to be able do this is to break the linestring into it's constituent segments and intersect each segment with the polygon. However this is taking a lot more compute time to complete as the trajectories can be quite long with many segments and there are ~10,000 polygons to intersect with.

I am actually doing this in Python using shapely but thought I would check with PostGIS first to see if this behaviour agreed with the OGC standard (and it does).


Solution

  • I managed to this efficiently in geopandas by doing the following

    import geopandas as gpd
    from shapely import LineString, Polygon
    
    line = LineString([[0, -2],[0, 0],[2, 0],[0, 0],[0, 2]])
    
    polygons = gpd.GeoDataFrame(geometry=[
        Polygon([[-1, -1], [-1, -3], [1, -3], [1, -1], [-1, -1]]),
        Polygon([[-1, 1], [-1, -1], [1, -1], [1, 1], [-1, 1]]),
        Polygon([[-1, 3], [-1, 1], [1, 1], [1, 3], [-1, 3]])
    ]).rename_geometry('polygon')
    polygons['code'] = ['A', 'B', 'C']
    polygons.set_index('code', inplace=True)
    
    ##################################################################
    
    points = gpd.GeoDataFrame(geometry=gpd.points_from_xy(*line.xy)).rename_geometry('point')
    points['next_point'] = points.shift(-1)['point']
    segments = gpd.GeoDataFrame(geometry = points['point'].shortest_line(points['next_point']).iloc[:-1:])
    
    s = (segments
        .sjoin(polygons)
        .set_index('code')
        .intersection(polygons, align = True))
    
    (s[s.notna()]
        .length
        .groupby(level=0).sum())