Converting an open curve to a list of ordered pixels: a Python test code with numpy

I have the image of an open curve in a numpy array and I need to build a list of points coordinates ordered according to their position on the curve. I wrote a draft script using numpy and mahotas. It may not be optimal.

I know that OpenCV can do this for a closed curve. Can OpenCV do the same (faster) with an open curve?

For example, if the original curve is:

[[0 0 0 0 0 0 0]
 [0 1 0 0 1 0 0]
 [0 0 1 0 0 1 0]
 [0 0 0 1 1 0 0]
 [0 0 0 0 0 0 0]]

Using np.where(myarray==1), I can get the indices of the pixels:

(array([1, 1, 2, 2, 3, 3]), array([1, 4, 2, 5, 3, 4]))

But this not what I need. My script yields the indices taking into account the order of the pixels on the curve:

i= 0 ( 1 , 1 )
i= 1 ( 2 , 2 )
i= 2 ( 3 , 3 )
i= 3 ( 3 , 4 )
i= 4 ( 2 , 5 )
i= 5 ( 1 , 4 )

I would like to optimize my script. Any ideas?

Solution

Assuming that only one curve is present in the matrix/image and that each point on the curve has between 1 and 2 neighbours, the following function will provide the results you need.

It works by taking the point closest to the top left hand corner, and forming a chain of points by iteratively finding the closest point that has not already been visited until no further points remain. For a closed curve, the squared Euclidean distance between the first/final points on the chain will be less than than 2.

import numpy as np

def find_chain(mat):
  locs=np.column_stack(np.nonzero(mat))
  chain=[np.array([0,0])]
  while locs.shape[0]>0:
    dists=((locs-np.vstack([chain[-1]]*locs.shape[0]))**2).sum(axis=1)
    next=dists.argmin()
    if dists.min()<=2 or len(chain)==1:
      chain.append(locs[next,:])
      locs=locs[np.arange(locs.shape[0])!=next,:]
    else:
      chain=[chain[0]]+chain[1::][::-1]
  return np.vstack(chain[1::]),((chain[1]-chain[-1])**2).sum()<=2

For an open curve:

>>> mat1=np.array([[0, 0, 0, 0, 0, 0, 0],
...                [0, 1, 0, 0, 1, 0, 0],
...                [0, 0, 1, 0, 0, 1, 0],
...                [0, 0, 0, 1, 1, 0, 0],
...                [0, 0, 0, 0, 0, 0, 0]])
>>> points,isclosed=find_chain(mat1)
>>> points
array([[1, 1],
       [2, 2],
       [3, 3],
       [3, 4],
       [2, 5],
       [1, 4]])
>>> isclosed
False

And for a closed curve:

>>> mat2=np.array([[0, 0, 0, 0, 0],
...                [0, 0, 1, 0, 0],
...                [0, 1, 0, 1, 0],
...                [0, 1, 0, 1, 0],
...                [0, 0, 1, 0, 0],
...                [0, 0, 0, 0, 0]])
>>> points,isclosed=find_chain(mat2)
>>> points
array([[1, 2],
       [2, 1],
       [3, 1],
       [4, 2],
       [3, 3],
       [2, 3]])
>>> isclosed
True

And a curve where the starting point (closest point to the origin) splits the curve in two.

>>> mat3=np.array([[0, 0, 0, 0, 0],
...                [0, 1, 1, 1, 0],
...                [0, 1, 0, 0, 0],
...                [0, 1, 0, 0, 0],
...                [0, 0, 0, 0, 0],
...                [0, 0, 0, 0, 0]])
>>> points,isclosed=find_chain(mat3)
>>> points
array([[1, 3],
       [1, 2],
       [1, 1],
       [2, 1],
       [3, 1]])
>>> isclosed
False