Tags: python-3.x, numpy, scipy, statistics, kolmogorov-smirnov

Comparing distributions in Python: SciPy ks_2samp p-value is always 0.0


I am trying to compare two distributions to see whether they are similar or different. I tried using ks_2samp from the Python SciPy package. Here is my code:

from scipy.stats import truncnorm
import matplotlib.pyplot as plt
from scipy import stats

def get_truncated_normal(mean=0, sd=1, low=0, upp=10):
    # truncnorm expects its bounds in standard-deviation units relative to loc
    return truncnorm((low - mean) / sd, (upp - mean) / sd, loc=mean, scale=sd)

x1 = get_truncated_normal(mean=183, sd=50, low=1, upp=365).rvs(5722176)
x2 = get_truncated_normal(mean=175, sd=50, low=1, upp=365).rvs(5722176)
plt.hist(x1)
plt.hist(x2)
plt.show()
print(stats.ks_2samp(x1, x2))

Output:
Ks_2sampResult(statistic=0.06409554686888352, pvalue=0.0)

Why is my output p-value always 0.0? Any help is truly appreciated. Thanks!


Solution

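  • See this Cross Validated post: https://stats.stackexchange.com/questions/18408/two-samples-of-the-same-distribution

    It suggests the Kolmogorov-Smirnov test for checking whether two samples come from the same distribution.

    You can run the KS test with SciPy. scipy.stats.kstest is the one-sample version; ks_2samp, which you are already using, is the two-sample version.

    https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.kstest.html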
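
As a rough illustration of why the reported p-value prints as 0.0, the sketch below does two things. First, it plugs the statistic and sample sizes from the question into the leading term of the asymptotic KS tail, p ≈ 2·exp(−2·en·D²) with en = n·m/(n+m), working in log10 space so the result cannot underflow. Second, it reruns ks_2samp on the same two truncated normals at much smaller sample sizes. The names log10_p and results, the seed, and the smaller sample sizes are illustrative assumptions, not part of the original question, and the leading-term formula is only an approximation of what SciPy computes internally.

```python
import math

import numpy as np
from scipy import stats
from scipy.stats import truncnorm


def get_truncated_normal(mean=0, sd=1, low=0, upp=10):
    # truncnorm expects its bounds in standard-deviation units relative to loc
    return truncnorm((low - mean) / sd, (upp - mean) / sd, loc=mean, scale=sd)


# Leading term of the asymptotic KS tail, p ~ 2 * exp(-2 * en * D**2),
# evaluated in log10 space so the estimate itself cannot underflow.
n = m = 5722176
d_stat = 0.06409554686888352   # statistic reported in the question
en = n * m / (n + m)           # effective sample size for two samples
log10_p = (math.log(2) - 2 * en * d_stat**2) / math.log(10)
print(f"approximate log10(p) at the original sample size: {log10_p:.0f}")

# Same two distributions, but at smaller (illustrative) sample sizes.
rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
dist1 = get_truncated_normal(mean=183, sd=50, low=1, upp=365)
dist2 = get_truncated_normal(mean=175, sd=50, low=1, upp=365)

results = []
for size in (100, 1_000, 100_000):
    x1 = dist1.rvs(size, random_state=rng)
    x2 = dist2.rvs(size, random_state=rng)
    res = stats.ks_2samp(x1, x2)
    results.append((size, res.statistic, res.pvalue))
    print(size, res.statistic, res.pvalue)
```

Under these assumptions, the estimated p-value at the original sample size is thousands of orders of magnitude below the smallest positive double (about 5e-324), so it is displayed as 0.0. That is a floating-point underflow, not a bug in the call. With millions of points per sample, the test flags even a modest shift in the mean as overwhelmingly significant, so the statistic itself (about 0.064 here) is likely a more useful measure of how different the two distributions are.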