Search code examples
pythonscipyintegraldata-fitting

Fitting data with integral function


When using curve_fit from scipy.optimize to fit a some data in python, one first defines the fitting function (e.g. a 2nd order polynomial) as follows:

  1. def f(x, a, b): return a*x**2+b*x
  2. And then proceeds with the fitting popt, pcov = curve_fit(f,x,y)

But the question is now, how does one go about defining the function in point 1. if the function contains an integral (or a discrete sum), e.g.:

enter image description here

The experimental data is still given for x and f(x), so point 2. would be similar I imagine once I can define f(x) in python. By the way I forgot to say that it is assumed that g(t) has a well known form here, and contains the fitting parameters, i.e. parameters like a and b given in the polynomial example. Any help is much appreciated. The question is really supposed to be a generic one, and the functions used in the post are just random examples.


Solution

  • Here's an example of fitting a curve defined in terms of an integral. The curve is the integral of sin(t*w)/t+p over t from 0 to Pi. Our x data points correspond to w, and we're adjusting the p parameter to to get the data to fit.

    import math, numpy, scipy.optimize, scipy.integrate
    
    def integrand(t, args):
        w, p = args
        return math.sin(t * w)/t + p
    
    def curve(w, p):
        res = scipy.integrate.quad(integrand, 0.0, math.pi, [w, p])
        return res[0]
    
    vcurve = numpy.vectorize(curve, excluded=set([1]))
    
    truexdata = numpy.asarray([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
    trueydata = vcurve(truexdata, 1.0)
    
    xdata = truexdata + 0.1 * numpy.random.randn(8)
    ydata = trueydata + 0.1 * numpy.random.randn(8)
    
    popt, pcov = scipy.optimize.curve_fit(vcurve,
                                          xdata, ydata,
                                          p0=[2.0])
    print popt
    

    That'll print something out fairly close to 1.0, which is what we used as p when we created the trueydata.

    Note that we use numpy.vectorize on the curve function to produce a vectorized version compatible with scipy.optimize.curve_fit.