python numpy scipy statistics distribution

Maxwellian Distribution in Python Scipy

In the article that I am interested in, it states that the data is well represented with a Maxwellian distribution and it also provides a Mean speed (307 km/s) and 1 sigma uncertainty (47 km/s) for the distribution.

Using the provided values, I have attempted to re-generate the data and then fit it with the Maxwellian distribution using the python scipy.stats.

As it described in here, maxwell function in scipy takes two input, 1) "loc" which shifts the x variable and 2) "a" parameter which corresponds to the parameter "a" in the maxwell-Boltzmann equation.

In my case, I have neither of these parameters, so using the Mean and variance (sigma^2) description in wiki page, I have attempted to calculate the "a" and "loc" parameter. Both mean and sigma parameters are only dependent on "a" parameter.

The first problem I have encountered was the "a" parameter that I get from Mean (a = 192.4) and sigma (a = 69.8) are different from each other. The second problem is that I don't know how can I obtain the exact loc (shift) value from Mean and sigma.

Based on the shape of the distribution (where mean speed values fall in the graph, check figure 2), I tried to guess the "loc" value and together with the "a" value obtained from sigma (a = 69.8), I have generated and fitted the data. Approximately it seems correct, but I don't know the answer to the questions I mentioned above and I need some expert's guidance on this. I appreciate any help.

import matplotlib.pyplot as plt
import math
from scipy.stats import norm
import random
import numpy as np
import scipy.optimize
from scipy.stats import maxwell

samplesize = 100000

mean = 307
sigma = 47
loc = 175 #my guess
a_value = np.sqrt((sigma**2 * math.pi)/(3*math.pi - 8)) #calculated based on wiki description

fig, axs = plt.subplots(1)
v_2d = maxwell.rvs(loc, a_value, size=samplesize) #array corresponding to 2D proper motion obtained from Hubbs
mean, var, skew, kurt = maxwell.stats(moments='mvsk')

N, bins, patches = plt.hist(v_2d, bins=100, density=True, alpha=0.5, histtype='bar', ec='black')
maxx = np.linspace(min(v_2d), max(v_2d), samplesize)

axs.plot(maxx, maxwell.pdf(maxx, loc, a_value), color=colorset[6], lw=2, label= r'$\mathdefault{\mu}$ = '+'{:0.1f}'.format(mean)+r' , '+r'$\mathdefault{\sigma}$ = '+'{:0.1f}'.format(sigma))

axs.set(xlabel=r'2-D Maxwellian speed (km s$^{-1}$)')
axs.set(ylabel='Frequency')
plt.legend(loc='upper right')

Solution

Well, mean value is affected by location, and sigma won't. So compute a from sigma, compute mean as if loc=0, find the difference and assign it to location, sample 100K RVs to check if sampled mean/stddev are close enough.

Code, Python 3.8, Windows 10 x64

import numpy as np

from scipy.stats import maxwell

σ = 47
μ = 307

a = σ * np.sqrt(np.pi/(3.0*np.pi - 8.0))
print(a)

m = 2.0*a*np.sqrt(2.0/np.pi)
print(m) # as if loc=0

loc = μ - m
print(loc)

print("----------Now test--------------------")

# sampling
q = maxwell.rvs(loc=loc, scale=a, size=100000)

print(np.mean(q))
print(np.std(q))

as output I've got

306.9022249667151
47.05319429681308

Good enough?