python numpy scipy statistics pymc3

Parameters for fitted distribution


When I searched for the best-fit distribution for my dataset, the result was the exponentially modified normal distribution with the following parameters:

K=10.84, loc=154.35, scale=73.82 

SciPy can compute the mean of this distribution:

  import scipy.stats
  fitted_mean = scipy.stats.exponnorm.stats(K=10.84, loc=154.35, scale=73.82, moments='mean')

The result is fitted_mean=984, which matches the mean of my dataset. However, I'm not sure what this tells me: I thought loc=154.35 was the mean of the distribution.

What do these two values represent? If I fitted the data with the best distribution, isn't loc (154.35) the one and only mean?


Solution

  • For the exponentially modified normal distribution, the location parameter is not the same as the mean. This is true for many distributions.

    Take a look at the Wikipedia page for the exponentially modified Gaussian distribution. This is the same distribution as scipy.stats.exponnorm, but with a different parameterization. The mapping between the Wikipedia parameters and the SciPy parameters is:

    μ = loc
    σ = scale
    λ = 1/(K*scale)
    

    The Wikipedia page gives the mean of the distribution as μ + 1/λ. Since 1/λ = K*scale, this is loc + K*scale in terms of the SciPy parameters.

    When you fit the distribution to your data, you found

    loc = 154.35
    scale = 73.82 
    K = 10.84
    

    The formula for the mean from the Wikipedia page gives

    loc + K*scale = 954.5587999999999
    

    Here is the calculation using exponnorm:

    In [16]: fitted_mean = scipy.stats.exponnorm.stats(K=10.84, loc=154.35, scale=73.82, moments='mean')
    
    In [17]: fitted_mean
    Out[17]: array(954.5587999999999)
    

    which matches the result from the Wikipedia formula.

    (You reported fitted_mean = 984, but I assume that was a typographical error.)
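
To see why loc alone is not the mean, it may help to build the distribution by hand. An exponentially modified normal variable is the sum of a normal variable with mean μ = loc and standard deviation σ = scale, and an independent exponential variable with mean 1/λ = K*scale. The sketch below is only an illustration under those assumptions: the sample size, the seed, and the variable names are arbitrary choices, not part of the original question. It compares the closed-form mean, the value from exponnorm.mean, and the mean of a simulated sample.

```python
import numpy as np
from scipy import stats

# Fitted parameters quoted in the question.
K, loc, scale = 10.84, 154.35, 73.82

# Closed-form mean in SciPy's parameterization: mu + 1/lambda = loc + K*scale.
formula_mean = loc + K * scale

# Mean computed by SciPy for the same parameters.
scipy_mean = float(stats.exponnorm.mean(K, loc=loc, scale=scale))

# Build the same random variable by hand: a normal(loc, scale) draw plus an
# independent exponential draw whose mean is 1/lambda = K*scale.
rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 200_000                     # arbitrary sample size
normal_part = rng.normal(loc=loc, scale=scale, size=n)
exponential_part = rng.exponential(scale=K * scale, size=n)
sample_mean = (normal_part + exponential_part).mean()

print(f"loc (mean of the normal part only): {loc}")
print(f"formula mean loc + K*scale:         {formula_mean:.4f}")
print(f"exponnorm.mean:                     {scipy_mean:.4f}")
print(f"simulated sample mean:              {sample_mean:.1f}")
```

The normal part averages out to loc, while the exponential part adds roughly K*scale on top, so the simulated mean should land near loc + K*scale rather than near loc.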