Search code examples
scipyintegral

Fitting Integral to data


I am trying to fit data to a function f(x) that is an integral over T. x is the upper boarder of the Integral. I am trying to do it with scipy.curve_fit() but I don't know how to write my integral as a function that can be passed to curve_fit. I had a look at similar problems but I didn't see anything that fits to my problem.

I cannot provide any guess values for A and Ea since I have no clue at all what they could be as of now.

from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np
from scipy import interpolate
from scipy import integrate

class get_Ton:
 def __init__(self):
  self.data=np.genfromtxt('test3.csv', delimiter=',', skip_header=8)
 def loop(self):
    def fit_tangent():
      tck = interpolate.splrep(self.x, self.y, k=2, s=0)
      dev_1 = interpolate.splev(self.x, tck, der=1)

      def integrand(T, A, Ea):
        return A*np.exp(-Ea/(8.314*T))
      def polyn(x, A, Ea):
        return integrate.quad(integrand, 25, x, args=(A, Ea))[0]

      vcurve = np.vectorize(polyn)
      p, e = optimize.curve_fit(vcurve, self.x, self.y, [2000, 150])
      xd = np.linspace(50, 70, 100)

      plt.plot(self.x, self.y, "o")
      plt.plot(xd, vcurve(xd, *p))

    for i in range((list(np.shape(self.data)[1:2]))[0]):
        if i % 2 == 0:
         self.temp=self.data[:,i]
         self.scat=self.data[:,i+1]
         self.x=[26.192, 26.861000000000001, 27.510000000000002, 28.178000000000001, 28.856000000000002, 29.515000000000001, 30.183, 30.823, 31.5, 32.158999999999999, 32.856000000000002, 33.515000000000001, 34.145000000000003, 34.823, 35.491, 36.168999999999997, 36.837000000000003, 37.533999999999999, 38.164000000000001, 38.832000000000001, 39.481000000000002, 40.158999999999999, 40.826999999999998, 41.496000000000002, 42.164000000000001, 42.832000000000001, 43.500999999999998, 44.188000000000002, 44.837000000000003, 45.505000000000003, 46.173000000000002, 46.832000000000001, 47.500999999999998, 48.188000000000002, 48.828000000000003, 49.496000000000002, 50.173999999999999, 50.813000000000002, 51.481000000000002, 52.112000000000002, 52.808999999999997, 53.439, 54.116, 54.765999999999998, 55.453000000000003, 56.101999999999997, 56.761000000000003, 57.429000000000002, 58.078000000000003, 58.737000000000002, 59.442999999999998, 60.082999999999998, 60.770000000000003, 61.448, 62.125999999999998, 62.756, 63.414999999999999, 64.082999999999998, 64.742000000000004, 65.420000000000002, 66.087999999999994, 66.747, 67.415000000000006]
         self.y=[1553.5, 1595.0, 1497.8, 1695.5999999999999, 1328.7, 1279.0, 1547.8, 1412.8, 1037.0, 1473.5, 1447.4000000000001, 1532.5999999999999, 1484.2, 1169.5, 1395.2, 1183.5999999999999, 949.01999999999998, 1238.0999999999999, 1225.5999999999999, 924.80999999999995, 1650.5999999999999, 803.96000000000004, 1245.7, 1190.0, 1207.0, 1294.0, 1174.9000000000001, 1229.8, 1260.0, 1129.2, 1142.9000000000001, 987.63999999999999, 1389.5999999999999, 1366.0, 1102.0999999999999, 1325.5, 1258.9000000000001, 1285.7, 1217.5, 871.47000000000003, 820.24000000000001, 1388.7, 1391.0, 1400.3, 2482.5999999999999, 3360.5999999999999, 7013.5, 11560.0, 16525.0, 22538.0, 32556.0, 43878.0, 59093.0, 67977.0, 75949.0, 82316.0, 90213.0, 90294.0, 99928.0, 128240.0, 181280.0, 226380.0, 223260.0]
         fit_tangent()
         plt.ylim((-100,1000000))
         plt.show()

def main():
    this=get_Ton()
    this.loop()

if __name__ == "__main__":
    main()

Solution

  • Three issues here. First, the function polyn does not depend on the variable of integration T, since this variable is integrated out. Remove T from its list of parameters. Accordingly, drop one of numerical values in trueydata computation:

    trueydata = vcurve(truexdata, 3, 4) 
    

    Second, quad returns a tuple (integral_value, integral_error). Use [0] to return only the integral value.

    def polyn(x, A, Ea):
        return integrate.quad(integrand, 25, x, args=(A, Ea))[0]
    

    Third, provide an initial guess for parameter values to curve_fit. If you don't, it will likely report unable to determine the number of parameters to fit. Even if successful, it will blindly use all-ones for the initial guess. An initial guess supplied by a human with an understanding of the optimization problem is often the difference between success and failure of multivariable optimization.

    popt, pcov = optimize.curve_fit(vcurve, xdata, ydata, [2, 3])