Search code examples
python-2.7matplotlibcurve-fittingbest-fit-curve

Python curve fitting on a barplot


How do I fit a curve on a barplot?

I have an equation, the diffusion equation, which has some unknown parameters, these parameters make the curve larger, taller, etc. On the other hand I have a barplot coming from a simulation. I would like to fit the curve on the barplot, and find the best parameters for the curve, how can I do that?

This is what I obtained by 'manual fitting', so basically I changed manually all the parameters for hours. However is there a way to do this with python?

Attempt of curve fitting

To make it simple, imagine I have the following code:

import matplotlib.pyplot as plt


list1 = []
for i in range(-5,6):
    list1.append(i)
    
width = 1/1.5

list2 = [0,0.2,0.6,3.5,8,10,8,3.5,0.6,0.2,0]

plt.bar(list1,list2,width)

plt.show()
T = 0.13
xx = np.arange(-6,6,0.01)
yy = 5*np.sqrt(np.pi)*np.exp(-((xx)**2)/(4*T))*scipy.special.erfc((xx)/(2*np.sqrt(T))) + np.exp(-((xx)**2)/(4*T)) 

plt.plot(xx,yy)

plt.show()

Clearly the fitting here would be pretty hard, but anyway, is there any function or such that allows me to find the best coefficients for the equation: (where T is known)

y = A*np.sqrt(np.pi*D)*np.exp(-((x-E)**2)/(4*D*T))*scipy.special.erfc((x-E)/(2*np.sqrt(D*T))) + 300*np.exp(-((x-E)**2)/(4*D*T)) 

EDIT: This is different from the already asked question and from the scipy documentation example. In the latter the 'xdata' is the same, while in my case it might and might not be. Furthermore I would also be able to plot this curve fitting, which isn't shown on the documentation. The height of the bars is not a function of the x's! So my xdata is not a function of my ydata, this is different from what is in the documentation. To see what I mean try to change the code in the documentation a little bit, to fall into my example, try this:

def func(x,a,b,c):
    return a * np.exp(-b * x) + c
    
xdata = np.linspace(0,4,50)
y = func(xdata, 2.5, 1.3, 0.5)
ydata = [1,6,3,4,6,7,8,5,7,0,9,8,2,3,4,5]

popt, pcov = curve_fit(func,xdata,ydata)

if you run this, it doesn't work. The reason is that I have 16 elements for the ydata and 50 for the function. This happens because y takes values from xdata, while ydata takes values from another set of x values, which is here unknown.

Thank you


Solution

  • I stand by my thinking that this question is a duplicate. Here is a brief example of the typical workflow using curve_fit. Let me know if you still think that your situation is different.

    import numpy as np
    from scipy.optimize import curve_fit
    import matplotlib.pyplot as plt
    
    # bar plot data
    list1 = range(-5, 6)
    list2 = [0, 0.2, 0.6, 3.5, 8, 10,
             8, 3.5, 0.6, 0.2, 0]
    width = 1/1.5
    
    plt.bar(list1, list2, width, alpha=0.75)
    
    # fit bar plot data using curve_fit
    def func(x, a, b, c):
        # a Gaussian distribution
        return a * np.exp(-(x-b)**2/(2*c**2))
    
    popt, pcov = curve_fit(func, list1, list2)
    
    x = np.linspace(-5, 5, 100)
    y = func(x, *popt)
    
    plt.plot(x + width/2, y, c='g')
    

    enter image description here