Search code examples
pythoncurve-fittingleast-squaresdata-fitting

Simultaneous data fitting in python with leastsq


I didn't program for a long time and never was good at it, but it is kind of important task I am struggling with. I am trying to fit two sets of data (x – time, y1 and y2 – different columns of values which should be read from text file). For each dataset (y1 and y2) I have a function which should fit them. Inside both of the functions I have several parameters to be fitted. For some time values the data for "y" is absent, so the task is to program it somehow when “y” is missing and fit the data without these values. Parameters should fit both functions, so it should be simultaneous fitting. And that's the problem I cannot solve.

To define the function in my case it is necessary to use Inverse Laplace Transform, so I used and there are not questions about that.

import numpy as np
import pylab as plt
from scipy.optimize import leastsq
from cmath import *

# Here is the Laplace functions 

def Fp(s, td, m0, kon, koff):
   gs=s+kon-kon*koff/(s+koff)
   sr=np.sqrt(gs*td)
   return (m0/(s*s))*sr/sinh(sr)

def Fd(s, td, m0, kon, koff):
   gs=s+kon-kon*koff/(s+koff)
   sr=np.sqrt(gs*td)
   fu=koff/(koff+kon)
   fs=fu+koff*(1-fu)/(s+koff)
   return (m0/s)*fs*2*tanh(0.5*sr)/sr

# Define the trig functions cot(phi) and csc(phi)

def cot(phi):
   return 1.0/tan(phi)
def csc(phi):
   return 1.0/sin(phi)

# inverse Laplace transform for two functions which are going to be fitted next

def Qpt(t, td, m0, kon, koff): 
    shift = 0.1;
    ans = 0.0;
    N=30
    h = 2*pi/N;
    c1 = 0.5017
    c2 = 0.6407
    c3 = 0.6122
    c4 = 0. + 0.2645j

    for k in range(0,N):
         theta = -pi + (k+1./2)*h;
         z = shift + N/t*(c1*theta*cot(c2*theta) - c3 + c4*theta); 
         dz = N/t*(-c1*c2*theta*(csc(c2*theta)**2)+c1*cot(c2*theta)+c4);
         ans += exp(z*t)*Fp(z, td, m0, kon, koff)*dz;
    return ((h/(2j*pi))*ans).real

def Qdt(t,td, m0, kon, koff): 
    shift = 0.1;
    ans = 0.0;
    N=30
    h = 2*pi/N;
    c1 = 0.5017
    c2 = 0.6407
    c3 = 0.6122
    c4 = 0. + 0.2645j
    for k in range(0,N):
        theta = -pi + (k+1./2)*h;
        z = shift + N/t*(c1*theta*cot(c2*theta) - c3 + c4*theta); 
        dz = N/t*(-c1*c2*theta*(csc(c2*theta)**2)+c1*cot(c2*theta)+c4);
        ans += exp(z*t)*Fd(z, td, m0, kon, koff)*dz;
    return ((h/(2j*pi))*ans).real

#now we have Qp and Qd as theoretical functions

I compiled that and asked a program several values, Qp and Qd are defined correctly. The only question about part above: is it possible somehow to define both functions at the same time without doing transform twice?

Then I added part with simultaneous fitting of this functions and it gave me a mistake:

TypeError: only length-1 arrays can be converted to Python scalars 

So this is my fitting part:

# FITTING PART

def residuals(pars, t, qd, qp):
    td = np.array(pars[0])
    m0 = np.array(pars[1])
    kon = np.array(pars[2])
    koff = np.array(pars[3])
    diff1 = Qdt(t,td, m0, kon, koff) - qd
    diff2 = Qpt(t,td, m0, kon, koff) - qp
    return np.concatenate((diff1[np.where(qd!=-1)], diff2[np.where(qp!=-1)]))

# for both functions with all the values
t = np.array([0.5, 2, 5, 10, 15, 20, 30, 40, 60, 90, 120, 180])
qd = np.array([0.145043746,0.273566338,0.437829373,0.637962531,-1,0.898107567,-1,1.186340492,1.359184345,-1,1.480552058,1.548143954])
qp = np.array([-1,-1,0.002701867,0.006485195,0.014034067,-1,0.06650739,-1,0.309055933,0.645945584,1.000811933,-1])


# initial values 
par_init = np.array([1, 1, 1, 1])

best, cov, info, mesg, ier = leastsq(residuals, par_init, args=(t, qd, qp), full_output=True)

print(" best-fit parameters: ", best)


#for each function separately to plot them and fitted functions as well
xqd= [0.5, 2, 5, 10, 20, 40, 60, 120, 180]
xqp= [5, 10, 15, 30, 60, 90, 120]
yqd= [0.145043746, 0.273566338, 0.437829373, 0.637962531, 0.898107567, 1.186340492, 1.359184345, 1.480552058, 1.548143954]
yqp= [0.002701867, 0.006485195, 0.014034067, 0.06650739, 0.309055933, 0.645945584, 1.000811933]

tt=np.linspace(0,185,100)
qd_fit=Qdt(tt,best[0], best[1], best[2], best[3])
qp_fit=Qdp(tt,best[0], best[1], best[2], best[3])

plt.plot(xqd,yqd,'bD:',xqp,yqp,'r^:', tt,qd_fit,'b',tt,qp_fit,'r')

plt.grid()
plt.show()

I would appreciate any help and advices! I desperately need to get it of this mistake!

Thanks in advance!


Solution

  • to fit multiple model functions to different data sets simultaneously, you need to have your residuals function do the following:

    1. have all the variable parameters into 1 nd-array -- the second argument to leastsq()
    2. receive all the data to be fitted (here, x, y1, and y2) -- must match the "args" argument of leastsq()
    3. unpack the "current parameter value" array (residuals 1st argument)
    4. calculate the sub-residual for each dataset with the values of the appropriate parameters
    5. concatenate the results to a single 1d-array for the return value

    A simple residual might look like this:

    import numpy as np
    from scipy.optimize import leastsq
    
    def residual_two_functions(pars, x, y1, y2):
        off1   = pars[0]
        slope1 = pars[1]
        off2   = pars[2]
        slope2 = pars[3]
        diff1 = y1 - (off1 + slope1 * x)
        diff2 = y2 - (off2 + slope2 * x)
        return np.concatenate((diff1, diff2))
    
    # create two tests data sets
    NPTS = 201
    x  = np.linspace(0, 10, NPTS)
    y1 = -0.7 +  1.7*x + np.random.normal(scale=0.01, size=NPTS)
    y2 =  5.2 + 50.1*x + np.random.normal(scale=0.02, size=NPTS)
    
    # initial values 
    par_init = np.array([0, 1, 10, 100])
    
    best, cov, info, message, ier = leastsq(residual_two_functions,
                                            par_init, args=(x, y1, y2),
                                            full_output=True)
    
    print(" Best-Fit Parameters: ",  best)