Tags: python, numpy, machine-learning, least-squares, data-fitting

Check if 2d points are on a curved line


My image: Image to be processed

I am trying to detect curved lines in the image; the picture shows stacked coins, and I want to count the parallel curved lines. Most of the lines are discontinuous.

Let's say I use 5 points with numpy.polyfit and get the function that describes the line.

What would be the best approach to search for the lines and decide that these points are on line 1, those points are on line 2, and so on?

I was thinking of trying the least-squares approach and shifting the line up and down. I think of the curved line as a parabola (ax^2 + bx + c); shifting it means shifting the vertex: x = -b/(2a), so y = a*(-b/(2a))^2 + b*(-b/(2a)) + c.
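
For example, with made-up coefficients, a vertical shift only changes c, so the vertex x stays the same while its y moves by the shift:

a, b, c = 0.5, -2.0, 1.0        # hypothetical parabola coefficients
shift = 3.0                     # vertical offset between neighbouring lines

x_v = -b / (2 * a)              # vertex x: unaffected by a vertical shift
y_v = a * x_v**2 + b * x_v + c  # vertex y of the original parabola
print(x_v, y_v, y_v + shift)    # 2.0 -1.0 2.0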

import numpy as np
data = np.array([[0, 0], [1, -1], [2, -2], [3, -1], [4, 0]])
data_x = data[:, 0]  # x coordinates
data_y = data[:, 1]  # y coordinates
p = np.poly1d(np.polyfit(data_x, data_y, 2))  # fit a degree-2 polynomial

Could someone please show me an example of how to fit points from the image to the p I just found? How do I apply least squares here?

Thanks in advance!


Solution

  • After many days of reading and digging around the internet, I've found a very elegant solution using lmfit: https://lmfit.github.io/lmfit-py/ I thank the creators of this module for a great job done.

    Now, the solution for fitting data to a curved line. When we have a polynomial p:

    >>> p
    poly1d([ 0.42857143, -1.71428571,  0.05714286])

    Create a Python function with those parameters:

    def fu(x, a=0.4285, b=-1.71, c=0.057):
        # the defaults (taken from p above) become the initial parameter values
        return a * x * x + b * x + c
    

    Now we can create an lmfit Model with that function:

    >>> gmodel = Model(fu)
    >>> gmodel.param_names
    ['a', 'c', 'b']
    >>> gmodel.independent_vars
    ['x']
    

    You can see that it identifies the independent variable and the parameters. The fit will vary the parameters so that the function best matches the data:

    >>> result = gmodel.fit(y_test, x=x_test)
    >>> print(result.fit_report())
    [[Model]]
        Model(fu)
    [[Fit Statistics]]
        # function evals   = 11
        # data points      = 8
        # variables        = 3
        chi-square         = 2.159
        reduced chi-square = 0.432
        Akaike info crit   = -4.479
        Bayesian info crit = -4.241
    [[Variables]]
        a:   0.12619047 +/- 0.050695 (40.17%) (init= 0.4285)
        c:  -0.55833335 +/- 0.553020 (99.05%) (init= 0.057)
        b:  -0.52857141 +/- 0.369067 (69.82%) (init=-1.71)
    [[Correlations]] (unreported correlations are <  0.100)
        C(a, b)                      = -0.962 
        C(c, b)                      = -0.793 
        C(a, c)                      =  0.642 
    

    In the resulting plot, the dotted line is the initial prediction and the red line is the best-fit prediction.
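
    As a side note, the fitted coefficients can also be read back programmatically through lmfit's documented ModelResult attributes (best_values and params):

    best = result.best_values             # dict of best-fit values, e.g. best['a']
    a_stderr = result.params['a'].stderr  # standard error of 'a' from the report
    print(best['a'], best['b'], best['c'], a_stderr)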

    Full Python script:

    import matplotlib.pyplot as plt
    from lmfit import Model
    import numpy as np

    def fu(x, a=0.4285, b=-1.71, c=0.057):
        # defaults taken from the polyfit result serve as initial guesses
        return a * x * x + b * x + c

    gmodel = Model(fu)
    print("Params", gmodel.param_names)
    print("Independent var", gmodel.independent_vars)

    params = gmodel.make_params()
    print("Params prop", params)

    data_test = np.array([[0, 0], [1, -1.2], [2, -2], [3, -1.3],
                          [4, 0], [5, 0.5], [6, 0.9], [7, 1.5]])
    x_test = data_test[:, 0]
    y_test = data_test[:, 1]

    result = gmodel.fit(y_test, x=x_test)
    print(result.fit_report())

    plt.plot(x_test, y_test, 'bo')            # data points
    plt.plot(x_test, result.init_fit, 'k--')  # initial prediction
    plt.plot(x_test, result.best_fit, 'r-')   # best fit
    plt.show()
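
    Coming back to the original question of deciding which points belong to line 1, line 2, and so on: the answer above doesn't cover that part, so here is a rough RANSAC-style sketch of one possible approach (not from lmfit; the tolerance tol and the trial count are arbitrary assumptions to tune against the coin image, and edge_points is a placeholder for the Nx2 array of detected edge pixels). It repeatedly samples a few points, fits a parabola with numpy.polyfit, keeps the fit with the most inliers as one line, and repeats on the leftover points:

    import numpy as np

    def group_points_by_curve(points, tol=2.0, degree=2, trials=200, rng=None):
        # Sketch: sample (degree + 1) points, fit a parabola through them,
        # count how many of the remaining points lie within `tol` (vertical
        # distance) of that curve, and claim the best set of inliers as one line.
        rng = np.random.default_rng() if rng is None else rng
        remaining = np.asarray(points, dtype=float)
        groups = []
        while len(remaining) > degree + 1:
            best_mask = None
            for _ in range(trials):
                idx = rng.choice(len(remaining), degree + 1, replace=False)
                sample = remaining[idx]
                p = np.poly1d(np.polyfit(sample[:, 0], sample[:, 1], degree))
                mask = np.abs(p(remaining[:, 0]) - remaining[:, 1]) < tol
                if best_mask is None or mask.sum() > best_mask.sum():
                    best_mask = mask
            if best_mask.sum() <= degree + 1:  # nothing beyond the sample itself
                break
            groups.append(remaining[best_mask])
            remaining = remaining[~best_mask]
        return groups  # one array of (x, y) points per detected line

    lines = group_points_by_curve(edge_points)  # edge_points: Nx2 array from the image
    print("Found", len(lines), "curved lines")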