I'm trying to fit an histogram with a poisson function. I'm following some answers in the site but I can't solve the problem. I simple want to fit the uplims of the histogram's bars below, with a poisson function, using python 3.6.
I'm trying this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.special import factorial
def poisson(k, lamb):
return (lamb**k/factorial(k))*np.exp(-lamb)
fig, ax = plt.subplots()
ax.hist(magz,bins=10,alpha=0.3)
y,x=np.histogram(magz,bins=10)
x = x + (x[1]-x[0])/2
x = np.delete(x,-1)
parameters, cov_matrix = curve_fit(poisson, x, y)
x_new = np.linspace(x[1], x[-1], 50)
ax.plot(x_new, poisson(x_new, *parameters), color='b')
The blue fitting line is on the bottom of the graph, and it seems doesn't works.
the magz
values are:
magz = [ 24.638505, 20.446914, 22.10271, 21.227533, 21.761152, 18.923867,
24.054868, 23.92457, 21.515022, 21.835458, 21.204597, 21.848573,
24.036382, 21.126777, 21.599414, 20.044833, 24.390594, 23.772577,
19.608918, 22.676774, 23.6312, 24.12077, 21.22321, 20.350204,
20.548614, 22.650914, 20.561528, 24.892959, 22.49959, 22.94469,
24.346355, 23.934491, 22.448417, 20.535562, 20.785362, 25.131568,
24.462043, 24.173652, 19.105512, 20.641586, 19.5268, 25.0747, 23.254556,
24.460447, 24.37759, 22.708406, 20.765025, 17.27031, 22.723192, 22.4452,
23.366233, 23.238118, 20.72437, 22.278445, 22.231206, 20.98687,
20.341756, 22.968032, 22.504077, 21.277349, 19.163532, 22.44034,
22.406581, 21.999019, 23.313155, 17.945062, 23.027715, 23.640596,
20.60174, 20.124156, 22.39343, 23.786641, 25.201003, 25.227441,
22.537135, 23.779794, 19.416995, 23.550315, 23.225908, 22.579364,
21.69014, 20.50741, 22.543212, 21.640962, 20.425422, 23.793624,
20.554366, 23.522594, 21.03864, 18.282573, 17.33285, 21.015233,
21.611837, 24.00014, 22.741971, 24.416913, 24.053067, 24.314742,
22.400808, 19.622712, 24.911735, 20.04981, 19.128503, 23.83536,
21.864501, 25.037927, 24.449472, 24.200244, 24.492582, 22.378057,
25.411336, 21.727248, 20.16391, 22.045026, 20.017703, 18.534318,
21.995092, 21.434434, 21.825042, 22.78943, 19.006103, 22.398988,
24.198657, 25.018417, 18.829948, 23.94818, 20.813966]
I believe there are two separate issues here.
First, when using curve_fit()
, you really need to think about and give initial values for the parameters. Since you don't give an initial value for lamb
, scipy's curve_fit
makes the absolutely unjustifiable assumption that the value is 1.0. Let me be clear: it is a serious flaw in curve_fit()
that it allows you to not give a starting value and silently gives a value for you. This flaw trips up many users.
The trouble here is that the starting value of lambda=1 gives no intensity out at k > 10, where all of your samples are. Thus when making small changes to the value of lambda, the fit sees no improvement in the fit, and so has no way of exploring values and finding a better solution. You can test this by printing the values that curve_fit()
sends to your poisson()
function. You would definitely want a starting value around 10 or so. It doesn't need to be perfect, but it cannot be so far off that the model function gives no intensity at all.
Second, your values do not actually follow a Poisson distribution. It appears that they may be proportional to a Poisson distribution, but you would then need to include a scaling factor
In short, I think you'll want something like
def poisson(k, lamb, scale):
return scale*(lamb**k/factorial(k))*np.exp(-lamb)
and then call curve_fit()
with decent starting values for both lamb
and scale
:
parameters, cov_matrix = curve_fit(poisson, x, y, p0=[10., 10.])
For me, that still does not give a great fit, but at least it gives something reasonable. Hope that helps.