Fitting the location parameter of the gamma distribution with SciPy

Would somebody be able to explain to me how to use the location parameter with the gamma.fit function in SciPy?

It seems to me that a location parameter (μ) changes the support of the distribution from x ≥ 0 to y = ( x - μ ) ≥ 0. If μ is positive then aren't we losing all the data which doesn't satisfy x - μ ≥ 0?

Thanks!

Solution

The fit function takes all of the data into consideration when finding a fit. Adding noise to your data will alter the fit parameters and can give a distribution that does not represent the data very well. So we have to be a bit clever when we are using fit.
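
On the support question specifically: a point that lies below the location has zero density, so a candidate location above the smallest observation is not a case of quietly dropping data. It drives the log-likelihood of the whole sample to minus infinity. Here is a minimal sketch of that, assuming illustrative values (shape 2, scale 1, shift 2) and a hypothetical helper `log_likelihood`, none of which come from the question:

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(0)  # seeded so the run is repeatable

# Shifted gamma sample, so every value lies above 2 (illustrative values)
data = rng.gamma(shape=2.0, scale=1.0, size=500) + 2.0

def log_likelihood(loc):
    """Total log-likelihood of `data` for a candidate location (hypothetical helper)."""
    return gamma.logpdf(data, a=2.0, loc=loc, scale=1.0).sum()

ll_below = log_likelihood(data.min() - 0.1)  # every point is inside the support
ll_above = log_likelihood(data.min() + 0.1)  # at least one point falls outside

print("loc just below min(data):", ll_below)
print("loc just above min(data):", ll_above)
```

The first value is finite and the second is `-inf`, which is why a maximum likelihood fit is pushed toward locations at or below the smallest data point.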

Below is some code that generates data, y1, with loc=2 and scale=1 using numpy. It also adds uniform noise over the range 0 to 10 to create y2. Fitting y1 yields excellent results, but fitting the noisy y2 is problematic because the added noise smears out the distribution. However, we can hold one or more parameters constant when fitting. Here we pass floc=2 to fit, which holds the location at 2 during the fit and returns much better results.

from scipy.stats import gamma
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0,10,.1)
y1 = np.random.gamma(shape=1, scale=1, size=1000) + 2  # sets loc = 2 
y2 = np.hstack((y1, 10*np.random.rand(100)))  # add noise from 0 to 10

# fit the distributions, get the PDF distribution using the parameters
shape1, loc1, scale1 = gamma.fit(y1)
g1 = gamma.pdf(x=x, a=shape1, loc=loc1, scale=scale1)

shape2, loc2, scale2 = gamma.fit(y2)
g2 = gamma.pdf(x=x, a=shape2, loc=loc2, scale=scale2)

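# again fit the distribution, but force loc=2
# (recent SciPy versions raise FitDataError when any value is <= floc,
# so only the samples above the fixed location are passed in)
shape3, loc3, scale3 = gamma.fit(y2[y2 > 2], floc=2)
g3 = gamma.pdf(x=x, a=shape3, loc=loc3, scale=scale3)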

And make some plots...

# plot the distributions and fits.  too lazy to do iteration today
fig, axes = plt.subplots(1, 3, figsize=(13,4))
ax = axes[0]
ax.hist(y1, bins=40, density=True)  # density replaces the removed normed argument
ax.plot(x, g1, 'r-', linewidth=6, alpha=.6)
ax.annotate('shape = %.3f\nloc = %.3f\nscale = %.3f' %(shape1, loc1, scale1), xy=(6,.2))
ax.set_title('gamma fit')

ax = axes[1]
ax.hist(y2, bins=40, density=True)
ax.plot(x, g2, 'r-', linewidth=6, alpha=.6)
ax.annotate('shape = %.3f\nloc = %.3f\nscale = %.3f' %(shape2, loc2, scale2), xy=(6,.2))
ax.set_title('gamma fit with noise')

ax = axes[2]
ax.hist(y2, bins=40, density=True)
ax.plot(x, g3, 'r-', linewidth=6, alpha=.6)
ax.annotate('shape = %.3f\nloc = %.3f\nscale = %.3f' %(shape3, loc3, scale3), xy=(6,.2))
ax.set_title('gamma fit w/ noise, location forced')
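
The effect of fixing the location can also be checked numerically rather than by eye. The sketch below is only an illustration: the seed, sample size, and true parameters (shape 2, loc 2, scale 1) are assumed values, and the data is kept noise-free so every sample lies above the fixed location.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(42)  # seeded so the run is repeatable

# Shifted gamma sample: shape=2, loc=2, scale=1 (illustrative values)
data = rng.gamma(shape=2.0, scale=1.0, size=2000) + 2.0

# Free fit: shape, loc and scale are all estimated
shape_free, loc_free, scale_free = gamma.fit(data)

# Constrained fit: loc is held at 2, only shape and scale are estimated
shape_fix, loc_fix, scale_fix = gamma.fit(data, floc=2)

print("free : shape=%.3f loc=%.3f scale=%.3f" % (shape_free, loc_free, scale_free))
print("fixed: shape=%.3f loc=%.3f scale=%.3f" % (shape_fix, loc_fix, scale_fix))
```

With `floc=2` the returned location is exactly the value passed in, and the remaining two parameters are estimated around it. The same pattern should work for holding the scale constant via `fscale`.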