Search code examples
pythongeometryshapelyline-segment

Find connected branches from list of line segments


Problem

I have a list of line segments:

exampleLineSegments = [(1,2),(2,3),(3,4),(4,5),(5,6),(4,7),(8,7)]

These segments include the indices of the corresponding point in a separate array.

From this sublist, one can see that there is a branching point (4). So three different branches are emerging from this branching point. (In other, more specific problems, there might be / are multiple branching points for n branches.)

Target

My target is to get a dictionary including information about the existing branches, so e.g.:

result = { branch_1: [1,2,3,4],
           branch_2: [4,5,6],
           branch_3: [4,7,8]}

Current state of work/problems

Currently, I am identifying the branch points first by setting up a dictionary for each point and checking for each entry if there are more than 2 neighbor points found. This means that there is a branching point.

Afterwards I am crawling through all points emerging from these branch points, checking for successors etc.

In these functions, there are a some for loops and generally an intensive "crawling". This is not the cleanest solution and if the number of points increasing, the performance is not so good either.

Question

What is the best / fastest / most performant way to achieve the target in this case?


Solution

  • I think you can achieve it by following steps:

    1. use a neighbors dict to store the graph
    2. find all branch points, which neighbors count > 2
    3. start from every branch point, and use dfs to find all the paths
    from collections import defaultdict
    def find_branch_paths(exampleLineSegments):
        # use dict to store the graph
        neighbors = defaultdict(list)
        for p1, p2 in exampleLineSegments:
            neighbors[p1].append(p2)
            neighbors[p2].append(p1)
    
        # find all branch points
        branch_points = [k for k, v in neighbors.items() if len(v) > 2]
    
        res = []
    
        def dfs(cur, prev, path):
            # reach the leaf
            if len(neighbors[cur]) == 1:
                res.append(path)
                return
            for neighbor in neighbors[cur]:
                if neighbor != prev:
                    dfs(neighbor, cur, path + [neighbor])
    
        # start from all the branch points
        for branch_point in branch_points:
            dfs(branch_point, None, [branch_point])
    
        return res
    

    update an iteration version, for big data, which may cause a recursion depth problem:

    def find_branch_paths(exampleLineSegments):
        # use dict to store the graph
        neighbors = defaultdict(list)
        for p1, p2 in exampleLineSegments:
            neighbors[p1].append(p2)
            neighbors[p2].append(p1)
    
        # find all branch points
        branch_points = [k for k, v in neighbors.items() if len(v) > 2]
    
        res = []
        # iteration way to dfs
        stack = [(bp, None, [bp]) for bp in branch_points]
        while stack:
            cur, prev, path = stack.pop()
    
            if len(neighbors[cur]) == 1 or (prev and cur in branch_points):
                res.append(path)
                continue
    
            for neighbor in neighbors[cur]:
                if neighbor != prev:
                    stack.append((neighbor, cur, path + [neighbor]))
    
        return res
    

    test and output:

    print(find_branch_paths([(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 7), (8, 7)]))
    # output: 
    # [[4, 3, 2, 1], [4, 5, 6], [4, 7, 8]]
    

    Hope that helps you, and comment if you have further questions. : )

    UPDATE: if there are many branch points, the path will grow exponentially. So if you only want distinct segments, you can end the path when encounter another branch point.

    change this line
    if len(neighbors[cur]) == 1:
    to
    if len(neighbors[cur]) == 1 or (prev and cur in branch_points):