Tags: python, scipy, statistics, hypothesis-test, kolmogorov-smirnov

Kolmogorov-Smirnov (ks_2samp) p-value not as expected - Wrong test or understanding?


Context

I am using scipy's ks_2samp to apply the two-sample Kolmogorov-Smirnov test.

The data I use is twofold:

  1. I have a dataset d1 containing an evaluation metric, the MASE (Mean Absolute Scaled Error), computed on the forecasts of a machine-learning model m1. It has around 6,000 data points, i.e. the MASE results of 6,000 forecasts made with m1.
  2. My second dataset d2 is analogous to d1, except that it comes from a second model m2, which differs slightly from m1.

The distributions of the two datasets look like this:

[Figure: distribution of d1]
[Figure: distribution of d2]

As can be seen, the two distributions look very much alike. I wanted to underline this with a Kolmogorov-Smirnov test. However, the result I get from ks_2samp indicates the contrary:

from scipy.stats import ks_2samp

ks_2samp(d1, d2)

# Ks_2sampResult(statistic=0.04779414731236298, pvalue=3.8802872942682265e-10)

As I understand it, such a p-value indicates that the two distributions are not alike (rejection of H0). But judging from the images, they clearly should be.
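To put the reported statistic in perspective, here is a rough sketch of the large-sample critical value for the two-sample test. It assumes exactly 6,000 points per sample (the question only says "around 6,000") and a significance level of 0.05, neither of which is stated above. It uses scipy.stats.kstwobign, the limiting Kolmogorov distribution, and is not meant to reproduce the exact p-value that ks_2samp reports.

```python
import math
from scipy.stats import kstwobign

# Assumption: both samples have exactly 6000 points ("around 6,000" in the question).
n1 = n2 = 6000
alpha = 0.05  # assumed significance level, not taken from the question

# Large-sample critical value: c(alpha) * sqrt((n1 + n2) / (n1 * n2)),
# where c(alpha) is the upper-alpha quantile of the Kolmogorov distribution.
c_alpha = kstwobign.isf(alpha)
d_crit = c_alpha * math.sqrt((n1 + n2) / (n1 * n2))

d_observed = 0.04779414731236298  # statistic reported by ks_2samp above

print(f"critical D = {d_crit:.4f}, observed D = {d_observed:.4f}")
print("reject H0:", d_observed > d_crit)
```

Under these assumptions the critical distance is roughly 0.025, so a maximum ECDF gap of about 0.048 is enough to reject H0 even though the histograms look nearly identical.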

Questions

  1. Am I misunderstanding how the Kolmogorov-Smirnov test is used, and is it not applicable to this use case or kind of distribution?
  2. If the answer to the first question is yes, what alternatives do I have?

Edit

Below is the overlay graph. Judging from the answers and comments on Cross Validated, I assume that the divergence in the "middle" might be the cause, since KS is sensitive there.
[Figure: overlay of d1 and d2]


Solution

  • I also posted this question on Cross Validated and got helpful insights and answers there (also note the new edit to the question).

    Kolmogorov-Smirnov (KS) is very sensitive to deviations in the middle of the distribution. As the overlay picture added to the question shows, there is some deviation exactly there. Presumably, this is why KS rejects H0 (= d1 and d2 come from the same distribution).

    For a more detailed answer, see @BruceET's answer on Cross Validated, who deserves the credit for this.
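To check where the two samples actually diverge, one option is to compute both empirical CDFs and locate their largest gap, which is the quantity the KS statistic measures. The sketch below is only an illustration: largest_ecdf_gap is a hypothetical helper, and the lognormal samples are synthetic stand-ins for d1 and d2, since the real MASE values are not available here.

```python
import numpy as np
from scipy.stats import ks_2samp


def largest_ecdf_gap(a, b):
    """Return (gap, location): the largest absolute ECDF difference and where it occurs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # The ECDFs only change at observed values, so it is enough to check those.
    grid = np.concatenate([a, b])
    ecdf_a = np.searchsorted(a, grid, side="right") / a.size
    ecdf_b = np.searchsorted(b, grid, side="right") / b.size
    diff = np.abs(ecdf_a - ecdf_b)
    i = int(np.argmax(diff))
    return float(diff[i]), float(grid[i])


# Synthetic stand-ins for d1 and d2 (assumption: slightly shifted lognormal samples).
rng = np.random.default_rng(0)
d1 = rng.lognormal(mean=0.0, sigma=0.5, size=6000)
d2 = rng.lognormal(mean=0.05, sigma=0.5, size=6000)

gap, location = largest_ecdf_gap(d1, d2)
result = ks_2samp(d1, d2)

print(f"largest ECDF gap {gap:.4f} at x = {location:.3f}")
print(f"ks_2samp statistic {result.statistic:.4f}, p-value {result.pvalue:.3g}")
```

The gap returned by the helper should match the statistic from ks_2samp, and the location shows which part of the value range drives the rejection. Applied to the real d1 and d2, it would indicate whether the divergence really sits in the middle of the distribution.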