Tags: python, numpy, machine-learning, least-squares, data-fitting

Check if 2d points are on a curved line


My image: Image to be processed

I am trying to detect curved lines in the image; the picture shows stacked coins, and I want to count the parallel curved lines. Most of the lines are discontinuous.

Let's say I use 5 points with numpy.polyfit and get the function that describes the line.

What would be the best approach to search for the lines and decide that these points are on line 1, those points are on line 2, and so on?

I was thinking of trying the least-squares approach and shifting the line up and down. I think of the curved line as a parabola (ax^2 + bx + c); shifting it means shifting the vertex: x = -b/(2a), so y = a*(-b/(2a))^2 + b*(-b/(2a)) + c.
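
For example, with made-up coefficients, a vertical shift only changes c, so the vertex x stays the same while its y moves by the shift:

a, b, c = 0.5, -2.0, 1.0        # hypothetical parabola coefficients
shift = 3.0                     # vertical offset between neighbouring lines

x_v = -b / (2 * a)              # vertex x: unaffected by a vertical shift
y_v = a * x_v**2 + b * x_v + c  # vertex y of the original parabola
print(x_v, y_v, y_v + shift)    # 2.0 -1.0 2.0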

import numpy as np
data = np.array([[0, 0], [1, -1], [2, -2], [3, -1], [4, 0]])
data_x = data[:, 0]  # x coordinates
data_y = data[:, 1]  # y coordinates
p = np.poly1d(np.polyfit(data_x, data_y, 2))  # fit a degree-2 polynomial

Could someone please show me an example of how to fit points from the image to the p I just found? How do I apply least squares here?

Thanks in advance!


Solution

  • After many days of reading and digging around the internet, I've found a very elegant solution using lmfit: https://lmfit.github.io/lmfit-py/ I thank the creators of this module for a great job done.

    Now, the solution for fitting data to a curved line. When we have a polynomial p:

    >>> p
    poly1d([ 0.42857143, -1.71428571,  0.05714286])

    Create a Python function with those parameters:

    def fu(x, a=0.4285, b=-1.71, c=0.057):
        # the defaults (taken from p above) become the initial parameter values
        return a * x * x + b * x + c
    

    Now we can create an lmfit Model with that function:

    >>> gmodel = Model(fu)
    >>> gmodel.param_names
    ['a', 'c', 'b']
    >>> gmodel.independent_vars
    ['x']
    

    You can see that it identifies the independent variable and the parameters. The fit will vary the parameters so that the function best matches the data:

    >>> result = gmodel.fit(y_test, x=x_test)
    >>> print(result.fit_report())
    [[Model]]
        Model(fu)
    [[Fit Statistics]]
        # function evals   = 11
        # data points      = 8
        # variables        = 3
        chi-square         = 2.159
        reduced chi-square = 0.432
        Akaike info crit   = -4.479
        Bayesian info crit = -4.241
    [[Variables]]
        a:   0.12619047 +/- 0.050695 (40.17%) (init= 0.4285)
        c:  -0.55833335 +/- 0.553020 (99.05%) (init= 0.057)
        b:  -0.52857141 +/- 0.369067 (69.82%) (init=-1.71)
    [[Correlations]] (unreported correlations are <  0.100)
        C(a, b)                      = -0.962 
        C(c, b)                      = -0.793 
        C(a, c)                      =  0.642 
    

    In the resulting plot, the dotted line is the initial prediction and the red line is the best-fit prediction.
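
    As a side note, the fitted coefficients can also be read back programmatically through lmfit's documented ModelResult attributes (best_values and params):

    best = result.best_values             # dict of best-fit values, e.g. best['a']
    a_stderr = result.params['a'].stderr  # standard error of 'a' from the report
    print(best['a'], best['b'], best['c'], a_stderr)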

    Full Python script:

    import matplotlib.pyplot as plt
    from lmfit import Model
    import numpy as np

    def fu(x, a=0.4285, b=-1.71, c=0.057):
        # defaults taken from the polyfit result serve as initial guesses
        return a * x * x + b * x + c

    gmodel = Model(fu)
    print("Params", gmodel.param_names)
    print("Independent var", gmodel.independent_vars)

    params = gmodel.make_params()
    print("Params prop", params)

    data_test = np.array([[0, 0], [1, -1.2], [2, -2], [3, -1.3],
                          [4, 0], [5, 0.5], [6, 0.9], [7, 1.5]])
    x_test = data_test[:, 0]
    y_test = data_test[:, 1]

    result = gmodel.fit(y_test, x=x_test)
    print(result.fit_report())

    plt.plot(x_test, y_test, 'bo')            # data points
    plt.plot(x_test, result.init_fit, 'k--')  # initial prediction
    plt.plot(x_test, result.best_fit, 'r-')   # best fit
    plt.show()
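
    Coming back to the original question of deciding which points belong to line 1, line 2, and so on: the answer above doesn't cover that part, so here is a rough RANSAC-style sketch of one possible approach (not from lmfit; the tolerance tol and the trial count are arbitrary assumptions to tune against the coin image, and edge_points is a placeholder for the Nx2 array of detected edge pixels). It repeatedly samples a few points, fits a parabola with numpy.polyfit, keeps the fit with the most inliers as one line, and repeats on the leftover points:

    import numpy as np

    def group_points_by_curve(points, tol=2.0, degree=2, trials=200, rng=None):
        # Sketch: sample (degree + 1) points, fit a parabola through them,
        # count how many of the remaining points lie within `tol` (vertical
        # distance) of that curve, and claim the best set of inliers as one line.
        rng = np.random.default_rng() if rng is None else rng
        remaining = np.asarray(points, dtype=float)
        groups = []
        while len(remaining) > degree + 1:
            best_mask = None
            for _ in range(trials):
                idx = rng.choice(len(remaining), degree + 1, replace=False)
                sample = remaining[idx]
                p = np.poly1d(np.polyfit(sample[:, 0], sample[:, 1], degree))
                mask = np.abs(p(remaining[:, 0]) - remaining[:, 1]) < tol
                if best_mask is None or mask.sum() > best_mask.sum():
                    best_mask = mask
            if best_mask.sum() <= degree + 1:  # nothing beyond the sample itself
                break
            groups.append(remaining[best_mask])
            remaining = remaining[~best_mask]
        return groups  # one array of (x, y) points per detected line

    lines = group_points_by_curve(edge_points)  # edge_points: Nx2 array from the image
    print("Found", len(lines), "curved lines")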