Search code examples
pythoncurve-fittinglmfittriangular

unexpected results while applying curve fitting to triangular pdf using lmfit


I have a set of points with x & y coordinates

x = np.array([0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330, 360, 390])
y = np.array([0, 1.2, 2.8, 1.7, 1.4, 1.2, 1.1, 0.91, 0.74, 0.61, 0.5, 0.28, 0.17, 0])

the points can fairly draw a triangle, so I'm doing three-points estimation using lmfit library.

note: I know scipy.stats.triang can help me with that, but it's a bit limited because x and c should be between [0,1], and I'm already doing a lot of lmfit in my project.

the code:

import numpy as np
import matplotlib.pylab as plt
import lmfit

# defining triangular function
def triangular_pdf(x, amplitude, start, shape, end):
    model = np.zeros_like(x) # initialize model array with zeros
    mask1 = (x < start) | (x > end)
    mask2 = (x >= start) & (x < shape)
    mask3 = (x == shape)
    mask4 = (x >= shape) & (x < end)
    
    model[mask1] = 0
    model[mask2] = 2*amplitude*(x[mask2]-start) / ((end-start)*(shape-start))
    model[mask3] = 2*amplitude / (end - start)
    model[mask4] = 2*amplitude*(end-x[mask4]) / ((end-start)*(end-shape))
    model = np.where((x>=start)&(x<=end), model, 0) 
    return model

model_triangular = lmfit.Model(triangular_pdf)

params_init_triangular    = model_triangular.make_params(amplitude={'value':y.max(), 'vary':True},
                                        start={'value':x[0], 'vary':False},
                                        end={'value':x[-1], 'vary':False},
                                        shape={'value':(x[-1]+x[0])/2, 'max':x[-1], 'min':x[0]})

result     = model_triangular.fit(y, params_init_triangular, x=x)  #, nan_policy='propagate')
params_opt = result.params
print(result.fit_report())

x_samp = np.linspace(x.min(), x.max(), 300)
y_samp_model = model_triangular.eval(params_opt, x=x_samp)

plt.plot(x, y, 'o', label='points')
plt.plot(x_samp, y_samp_model, '-', label='triang')
plt.legend()
plt.show()

now when I run the thing the result:

[[Model]]
    Model(triangular_pdf)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 3
    # data points      = 14
    # variables        = 2
    chi-square         = 18.8851000
    reduced chi-square = 1.57375833
    Akaike info crit   = 8.19042290
    Bayesian info crit = 9.46853756
    R-squared          = -1.50895005
##  Warning: uncertainties could not be estimated:
    amplitude:  at initial value
    shape:      at initial value
[[Variables]]
    amplitude:  2.80000000 (init = 2.8)
    start:      0 (fixed)
    shape:      195.000000 (init = 195)
    end:        390 (fixed)

points & wrong result

now this is obviously wrong and the parameters (amplitude & shape) should vary from initial values, which didn't happen.

in order to check if the triangular_pdf & model_triangular are working I tried this:

x_test = np.linspace(0,400,50)
y_test = triangular_pdf(x_test, 1000, 0, 100*np.random.rand(), 400) + np.random.rand(1,len(x_test))[0]

result_test     = model_triangular.fit(y_test, params_init_triangular, x=x_test)#, nan_policy='propagate')
params_opt_test = result_test.params
print(result_test.fit_report())

x_samp = np.linspace(x_test.min(), x_test.max(), 300)
y_samp_model = model_triangular.eval(params_opt_test, x=x_samp)

plt.plot(x_test, y_test, 'o', label='points')
plt.plot(x_samp, y_samp_model, '-', label='triang')
plt.legend()
plt.show()

and the result was

[[Model]]
    Model(triangular_pdf)
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 34
    # data points      = 50
    # variables        = 2
    chi-square         = 9.11427030
    reduced chi-square = 0.18988063
    Akaike info crit   = -81.1090828
    Bayesian info crit = -77.2850368
    R-squared          = 0.92041334
[[Variables]]
    amplitude:  1125.37941 +/- 21.3058857 (1.89%) (init = 2.8)
    start:      0 (fixed)
    shape:      92.1393307 +/- 3.07739313 (3.34%) (init = 195)
    end:        390 (fixed)

points with valid result

now the parameters (amplitude & shape) got changed, and the result is actually good, which means the method should work fine

I'm puzzled with where the problem is.

can anybody help me?


Solution

  • as @MNewville suggested the problem is with the base function, and the solution is as followed:

    from scipy import special
    
    def triangular_pdf(x, amplitude, start, shape, end):
        model1 = np.zeros_like(x)
        model2 = (x-start) / (shape-start)
        model3 = (end-x) / (end-shape)
        
        mask2 = (1-special.erf(x - shape)) / 2
        mask3 = (special.erf(x - shape)+1) / 2
        model = 2*amplitude* (model1 + model2 * mask2 + model3 * mask3) / (end - start)
        return model