Search code examples
python-3.5data-fitting

Fitting erf function to data using python


I have read about the erf function from this article.

I have also tried implementing the code provided in this thread.

But considering the definition of the erf function, I am not understanding how to implement the fit to the following data:

X   Y
 33     21.09
 35     21.14
 37     21.21
 39     21.32
 41     21.46
 43     23
 45     27
 47     31
 49     36
 51     40
 53     45
 55     49
 57     53
 59     57
 61     61
 63     65
 65     69
 67     73
 69     77
 71     78
 73     79
 75     79

Edit:

I changed the data so that it is not step anymore. The data plot looks as:

enter image description here

This plot should either fit sigmoid, tanh, etc. function. But using the code in the comment below I don't get a fit.


Solution

  • You can fit an erf quite easily. The solution in the comment above does not work as with the revised question the data has changed in such a way that the provided starting valued do not work. Nevertheless, the curvature when going from slope to horizontal is very large, i.e. the transition is very sharp such that neither erf, tanh, nor sigmoid work. Below I'll show the fits as well as some alternative/modified functions that fit the data better.

    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.special import erf
    from scipy.optimize import curve_fit
    
    data = np.array( [
        33, 21.09,
        35, 21.14,
        37, 21.21,
        39, 21.32,
        41, 21.46,
        43, 23,
        45, 27,
        47, 31,
        49, 36,
        51, 40,
        53, 45,
        55, 49,
        57, 53,
        59, 57,
        61, 61,
        63, 65,
        65, 69,
        67, 73,
        69, 77,
        71, 78,
        73, 79,
        75, 79 ] )
    
    xData = data[ ::2 ]
    yData = data[ 1::2 ]
    
    
    def sign_pow( x, p):
        return np.copysign( abs( x )**p, x )
    
    
    guessErf = [ 32., 55., 10., 20. ]
    solErf , pcov = curve_fit( lambda x, a, b, c, d:  a * (1 + erf( ( x- b ) / c ) ) + d , xData, yData, p0=guessErf )
    print solErf
    
    guessTh = [ 32., 55., 10., 20. ]
    solTh , pcov = curve_fit( lambda x, a, b, c, d:  a * (1 + np.tanh( ( x- b ) / c ) ) + d , xData, yData, p0=guessTh )
    print solTh
    
    guessThP = [ 32., 55., 10., 20., 1 ]
    solThP , pcov = curve_fit( lambda x, a, b, c, d, p:  a * (1 + sign_pow( np.tanh( sign_pow( ( x- b ) / c , p ) ) , (1./p) ) ) + d , xData, yData, p0=guessThP )
    print solThP
    
    
    guessLz = [ 32., 55., 10., 20.]
    solLz , pcov = curve_fit( lambda x, a, b, c, d:  a * ( 1 + (x - b) / np.sqrt( 1 + ( x - b )**2 / c ) )+ d , xData, yData, p0=guessLz )
    print solLz
    
    guessLzP = [ 2., 55., 200., 47., 2]
    solLzP , pcov = curve_fit( lambda x, a, b, c, d, p:  a * ( 1 + (x - b) / ( 1 + abs( ( x - b ) / c )**p )**( 1. / p ) ) + d , xData, yData, p0=guessLzP, maxfev=5000 )
    print solLzP
    #~ solLzP=guessLzP
    
    
    testX = np.linspace( 30, 80, 150 )
    
    erfList = [ solErf[0] * ( 1 + erf( ( x - solErf[1] ) / solErf[2] ) ) + solErf[3] for x in testX ]
    thList = [ solTh[0] * ( 1 + np.tanh( ( x - solTh[1] ) / solTh[2] ) ) + solTh[3] for x in testX ]
    thPList = [ solThP[0] * ( 1 + sign_pow( np.tanh( sign_pow( ( x - solThP[1] ) / solThP[2] , solThP[4] ) ) , 1. / solThP[4] ) ) + solThP[3] for x in testX ]
    lzList = [ solLz[0] * ( 1 + (x - solLz[1] ) / np.sqrt( 1 + ( x - solLz[1] )**2 / solLz[2] ) ) + solLz[3] for x in testX ]
    lzPList = [ solLzP[0] * ( 1 + (x - solLzP[1] ) / ( 1 + abs( ( x - solLzP[1] ) / solLzP[2] )**solLzP[4] )**( 1. / solLzP[4] ) ) + solLzP[3] for x in testX ]
    
    fig = plt.figure()
    ax = fig.add_subplot( 1, 1, 1 )
    ax.plot( xData, yData , ls='', marker='d', label='data')
    ax.plot( testX, erfList, label='erf')
    ax.plot( testX, thList, label='tanh' )
    ax.plot( testX, thPList, label= 'tanh power' )
    ax.plot( testX, lzList, label= 'x/1+x^2' )
    ax.plot( testX, lzPList, label= 'x/1+x^P' , ls='--')
    ax.legend( loc=0 )
    plt.show()
    

    Which provides:

    >> [31.57905297 55.87319052 14.64143581 18.66393531]
    >> [33.34190267 55.90472589 13.54119241 16.97367616]
    >> [28.77092734 55.75011551 13.6999564  21.16422056  7.22849049]
    >> [  2.49108137  55.93839349 232.504463    47.90421787]
    >> [  2.09892962  55.75526439 -13.72758215  47.8461409   17.7915103 ]
    

    some fits

    One can easily see the curvature problem. The modified tanh and the x/(1+x**p)**(1/p) fit equally well.