Search code examples
pythoncontourimage-recognitionsurface

How to quickly gather a list of points by following adjacent criterium


I have a list of points L=[[x1,y1],[x2,y2],...] and I would like to build a list S=[L1,L2,...] of "surface" generated by gathering the neighbor points of L. The type of "surface" is the same as the type of L, that is to say, a list of points (but that constitute a surface based on neighbor linking). However, what I have tried to do is not fast enough.

I used a recursive function F(L, P) that requires a list of points L, and the start point P=[x,y] (which has to be removed from the list L when the function is called). Then, it looks for all the neighbor points of P and callback the function on each one of them if they exist (after removing them from L). The base case is reached when the point tested has no longer neighbor points.

Thus, when all the base case is reached, F(L, P) returns a list of points L1 that constitute the surface associated to P. I then repeat the process for the remaining points of L and so on to build L2,L3,....

def F(L,P):   
    nhList=[]
    leftP=[P[0]+1,P[1]]
    rightP=[P[0]-1,P[1]]
    upP=[P[0],P[1]-1]
    dwP=[P[0],P[1]+1] 

    if(upP in L):
        L.remove(upP)
        nhList.append(upP)
    if(dwP in L):
        L.remove(dwP)
        nhList.append(dwP)
    if(leftP in L):
        L.remove(leftP)
        nhList.append(leftP)
    if(rightP in L):
        L.remove(rightP)
        nhList.append(rightP)

    if(nhList!=[]):
        rtList=[P]
        for ad in nhList:
            e=F(L,ad)
            rtList+=e
        return rtList
    else:
        return [P]

L=[[0,0],[1,0],[5,3],[1,1],[5,4],[2,2]] # e.g.
S=[]
while(L!=[]):
    P=L.pop()
    S.append(F(L,P))
print(S)
# Returns [[2, 2]], [[5, 4], [5, 3]], [[1, 1], [1, 0], [0, 0]] as expected

I expect to retrieve the list S as explained in the intro, and it works well. However, when I use this process on a bigger list of points (that contains 1M points for instance) it slows down the processing and sometimes, I even reach the recursion limit.

Therefore, I'm looking to generate the list S faster.


Solution

  • I think you can enhance the performance by following ideas:

    1. in big data to avoid recursion limit, you can use iteration instead of recursion
    2. to enhance the perfomance of query and modify in L, you can preprocess L into a set
    3. for algorithm, BFS is suitable here.

    here is my solution:

    from collections import deque
    
    L = [[0, 0], [1, 0], [5, 3], [1, 1], [5, 4], [2, 2]]  # e.g.
    # convert to set for query and modify performance, change list to immutable tuple.
    L = set(map(tuple, L))
    
    S = []
    while L:
        # start from a random point
        start = L.pop()
        queue, visited, cur_s = deque([start]), set([start]), [start]
    
        while queue:
            node = queue.popleft()
            visited.add(node)
            i, j = node
            # find the 4-adjacent neighbors
            for neighbor in (i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1):
                if neighbor in L and neighbor not in visited:
                    queue.append(neighbor)
                    cur_s.append(neighbor)
                    L.remove(neighbor)
        S.append(cur_s)
    
    print(S)
    

    output:

    [[(5, 4), (5, 3)], [(0, 0), (1, 0), (1, 1)], [(2, 2)]]
    

    Hope that helps you, and comment if you have further questions. : )