I have these two time series, and I want to test if they come from the same distribution. So I applied the scipy.stats.ks_2samp() test. But the test returns a p-value of 0.0028, whereas describe() gives these statistics:
count 120.000000 120.000000
mean 0.785867 0.774267
std 0.323941 0.304894
min 0.610000 0.610000
25% 0.619000 0.610000
50% 0.619000 0.619000
75% 0.749000 0.769500
max 1.812000 1.742000
So I don't get why the test rejects the null hypothesis, when mean and standard deviation are pretty similar. Also plots of the (cumulative) distributions look very similar.
Can anybody help me?
Here are my data and the test call:
import pandas as pd
from scipy import stats
df = pd.DataFrame(data=[[
0.62, 0.61, 0.61, 0.619, 0.619, 0.619, 0.62, 0.619, 0.61,
0.619, 0.62, 0.619, 0.619, 0.62, 0.611, 0.62, 0.62, 0.61,
0.619, 0.61, 0.619, 0.62, 0.642, 0.67, 0.749, 0.838, 0.862,
0.804, 0.89, 0.942, 1.012, 1.13, 1.14, 1.191, 1.201, 1.123,
1.299, 1.359, 1.411, 1.362, 1.352, 1.44, 1.451, 1.46, 1.557,
1.491, 1.622, 1.639, 1.787, 1.812, 1.665, 1.612, 1.253, 0.936,
0.704, 0.643, 0.62, 0.619, 0.62, 0.61, 0.619, 0.62, 0.619,
0.62, 0.61, 0.619, 0.61, 0.619, 0.62, 0.619, 0.62, 0.62,
0.619, 0.62, 0.62, 0.619, 0.62, 0.619, 0.619, 0.62, 0.619,
0.619, 0.619, 0.619, 0.61, 0.61, 0.619, 0.619, 0.619, 0.62,
0.619, 0.619, 0.619, 0.619, 0.61, 0.619, 0.619, 0.62, 0.619,
0.61, 0.619, 0.619, 0.619, 0.619, 0.61, 0.619, 0.619, 0.62,
0.619, 0.61, 0.619, 0.619, 0.62, 0.619, 0.749, 0.63, 0.62,
0.61, 0.619, 0.619],
[0.801, 0.644, 0.62, 0.62, 0.61, 0.61,
0.619, 0.62, 0.61, 0.61, 0.61, 0.61, 0.619, 0.619, 0.62,
0.61, 0.619, 0.61, 0.619, 0.62, 0.62, 0.629, 0.689, 0.759,
0.849, 0.84, 0.918, 1.019, 0.967, 0.92, 0.976, 1.089, 1.062,
1.219, 1.202, 1.261, 1.387, 1.422, 1.39, 1.264, 1.281, 1.35,
1.32, 1.419, 1.568, 1.554, 1.623, 1.592, 1.709, 1.742, 1.535,
1.123, 0.84, 0.682, 0.63, 0.62, 0.61, 0.61, 0.619, 0.62,
0.61, 0.61, 0.61, 0.61, 0.619, 0.62, 0.61, 0.619, 0.61,
0.62, 0.61, 0.62, 0.61, 0.61, 0.619, 0.62, 0.62, 0.61,
0.61, 0.61, 0.619, 0.62, 0.61, 0.619, 0.62, 0.61, 0.61,
0.61, 0.61, 0.61, 0.619, 0.62, 0.62, 0.61, 0.61, 0.61,
0.619, 0.619, 0.619, 0.61, 0.618, 0.61, 0.61, 0.619, 0.61,
0.61, 0.61, 0.61, 0.619, 0.619, 0.62, 0.61, 0.619, 0.62,
0.62, 0.61, 0.619, 0.61, 0.61, 0.61]]).T
print(stats.ks_2samp(df.iloc[:, 1], df.iloc[:, 0]).pvalue)
The Kolmogorov-Smirnov test did not fail. The seemingly flat tails of the two series really are substantially different from each other. We can see this by zooming in on the tails (starting at index 60) and sorting the values in each series for ease of comparison:
import matplotlib.pyplot as plt
plt.plot(df.iloc[60:, 0].sort_values(ignore_index=True))
plt.plot(df.iloc[60:, 1].sort_values(ignore_index=True), color='orange')
plt.ylim([0.605, 0.625]);
I don't know whether this is an artefact of how the data were recorded, or a real effect. In any case, note that the Kolmogorov-Smirnov test is not appropriate here, because it assumes two independent random samples, whereas what you have are time series in which time is clearly a significant factor.
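To see why similar means and standard deviations do not protect against a rejection, here is a minimal sketch using only the standard library. The two samples are synthetic and made up for illustration (they are not the data from the question), and the helper ecdf is a hypothetical name. Both samples sit on the same three levels, but in different proportions, much like the flat tails above. The two-sample KS statistic is the largest vertical gap between the two empirical CDFs, so it reacts to this difference even though the summary statistics barely move.

```python
from bisect import bisect_right
from statistics import mean, stdev

# Synthetic, made-up samples (NOT the data from the question): both sit on the
# same three levels, but with different proportions at the lowest level.
a = sorted([0.61] * 20 + [0.619] * 50 + [0.62] * 50)
b = sorted([0.61] * 60 + [0.619] * 30 + [0.62] * 30)


def ecdf(sorted_sample, x):
    """Fraction of the (sorted) sample that is <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


# The two-sample KS statistic is the largest vertical gap between the two
# empirical CDFs; the gap can only change at an observed value.
gaps = {x: abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b)}
x_max = max(gaps, key=gaps.get)
d = gaps[x_max]

print(f"means: {mean(a):.4f} vs {mean(b):.4f}")
print(f"stds:  {stdev(a):.4f} vs {stdev(b):.4f}")
print(f"largest ECDF gap: {d:.3f} at x = {x_max}")
```

In this toy case the means differ by only about 0.003, yet a third of the probability mass sits on different sides of 0.61. With 120 observations per sample, a gap of that size is far more than the test needs to reject.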