Search code examples
pythongeopandasshapely

How to get the intersection of a polygon with others in the same geodataframe?


I have a shapefile that contains thousands of polygons. Lots of them touch but don't cross. I need to get the public line of the touched polygon.

I tried using the following function to achieve my purpose, but the output shows some MultiLineString with lines only having two points, which should be a whole LineString.

def calcu_intersect_lines(cgidf):

    intersection = gpd.GeoDataFrame(columns=['geometry'], crs=cgidf.crs)

    while len(cgidf) > 1:
        choose = cgidf.iloc[0]
        cgidf.drop(cgidf.index[0], inplace=True)
        for i in range(len(cgidf.index)):
            cgids = cgidf.iloc[i]
            if choose.geometry.exterior.intersects(cgids.geometry.exterior):
                intersects = choose.geometry.exterior.intersection(cgids.geometry.exterior)
                index = len(intersection)
                intersection.loc[index] = [intersects]

            else:
                continue

        return intersection

For the MultiLineString, I tried using the shapely.geometry.LineString.union() function to join two short lines in the same MultiLineString if they touch with each other. But the result shows a MultiLineString too.

geopandas itself's intersection function seems to result in a MultiLineString too.

Is there any method returning a normal result (LineString not MultiLineString for a continuous public line)?


Here is a small example of input and output data:

a = Polygon(((0, 0), (0, 0.5), (0.5, 1), (1, 0.5), (1, 0), (0.5, -0.5), (0, 0)))
b = Polygon(((0, 0.5), (0.5, 1), (1, 0.5), (1, 2), (0, 0.5)))
c = Polygon(((1, 0.5), (1, 0), (0.5, -0.5), (1.5, -1), (1, 0.5)))
gdf = gpd.GeoDataFrame(columns=['geometry'], data = [a, b, c])
h = calcu_intersect_lines(gdf)

Here are the values of h:

index  geometry
0      MULTILINESTRING ((0 0.5, 0.5 1), (0.5 1, 1 0.5))
1      MULTILINESTRING ((1 0.5, 1 0), (1 0, 0.5 -0.5))

The LineString in the two MultiLineString have public point (0.5, 1) and (1, 0), respectively.

The result I want is like the following:

index  geometry
0      LINESTRING (0 0.5, 0.5 1, 1 0.5))
1      LINESTRING (1 0.5, 1 0, 0.5 -0.5))

Possible solution:

In the comments, I was proposed to replace the following line

intersection.loc[index] = [intersects]

by

intersection.loc[index] = [LineString([*intersects[0].coords, *map(lambda x: x.coords[1], intersects[1:])])]

It works well in my simple example. However, for the true shapefile, it will be much more complicated than that. There might be the following situations:

  1. Two polygons with more than one public line.

    from shapely.geometry import Polygon
    
    a = Polygon(((0., 0.), (0., 0.5), (0.5, 1.), (1., 0.5), (1., 0.), (0.5, -0.5), (0., 0.)))
    b = Polygon(((0., 0.5), (0.5, 1.), (1.2, 0.7), (1., 0.), (0.5, -0.5), (2., 0.5), (0., 2.)))
    

    For a and b, they have two pubilc lines LineString(((0., 0.5), (0.5, 1.))) and LineString(((1., 0.), (0.5, -0.5))). In this case I can simply use intersects function to test if lines touch. But then there is another problem:

  2. The lines in a MultiLineString are not in order.

    from shapely.geometry import MultiLineString
    
    ml = MultiLineString((((2, 3), (3, 4)), ((0, 2), (2, 3))))
    

    For ml, this suggestion will return a wrong result. Do you have any idea about the 2nd example above?


Solution

  • Thanks for Georgy and other contributors' help, I have solved my problem. The function shapely.ops.linemerge() introducted in here is the key point of my solution.

    I post my solution in here:

    from shapely import ops
    
    def union_multils(ml):
    
        '''Union touched LineStrings in MultiLineString or GeometryCollection.
    
        Parameter
        ---------
        ml: GeometryCollection, MultiLineString or LineString
    
        return
        ------
        ul: MultiLineString or LineString: a MultiLineString suggest the LineStrings 
            in input ml is not connect entitly.
        '''
    
        # Drop Point and other geom_type(if exist) out
        ml = list(ml)
        ml = [l for l in ml if l.geom_type == 'LineString']
    
        # Union
        if len(ml) ==  1 and ml[0].geom_type == 'LineString':
            ul = ml[0]
        else:
            ul = ops.linemerge(ml)
    
        return ul