I'm using the ks.test function in R to perform a Kolmogorov-Smirnov test. I found that the test statistic it reports is different from
max(abs(difference(x, y)))
According to the definition of the Kolmogorov-Smirnov test on Wikipedia, the two should be equal.
Does anyone know why they differ?
The KS statistic is not supposed to be equal to max(|x-y|).
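It is computed on cumulative distribution functions (CDFs), not on the raw values: the statistic is the largest absolute difference between the empirical CDF of a sample and the CDF of a reference distribution (or, in the two-sample case, between the two empirical CDFs). It therefore measures a difference in the proportion of observations at or below a given value, not a difference in the values themselves.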
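For reference, the two-sample statistic is usually written as below. The symbols (n and m for the sample sizes, F_{x,n} and F_{y,m} for the empirical CDFs) are notation introduced here for illustration, not taken from the R or MATLAB documentation:

```latex
D_{n,m} = \sup_{t} \left| F_{x,n}(t) - F_{y,m}(t) \right|,
\qquad
F_{x,n}(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{ x_i \le t \}
```

The supremum runs over values t on the measurement axis, so the result is always a difference of proportions between 0 and 1.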
See the two examples below executed in MATLAB (although I expect the results to be identical in R):
x = [1 2 3 4 5 6 7 8 9 10];
y = [1 2 3 4 5 6 7 8 9 11];
[~, ~, ks2s] = kstest2(x,y)
ks2s =
0.1000 (1)
x = [1 2 3 4 5 6 7 8 9 10];
y = [1 2 3 4 5 6 7 8 9 12];
[~, ~, ks2s] = kstest2(x,y)
ks2s =
0.1000 (2)
Thus, although the maximum absolute difference between the values of x and y is larger in (2), the KS statistic is the same: in both cases exactly one of the ten samples differs, so the two empirical CDFs are never more than 1/10 apart.
If y has an extra sample, for example, the result changes:
x = [1 2 3 4 5 6 7 8 9 10];
y = [1 2 3 4 5 6 7 8 9 10 11];
[~, ~, ks2s] = kstest2(x,y)
ks2s =
0.0909
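The three results above can be reproduced by hand from the empirical CDFs. The following is a minimal Python sketch of that calculation, not the implementation used by ks.test or kstest2; the function name ks_two_sample is hypothetical and chosen only for this illustration:

```python
from bisect import bisect_right

def ks_two_sample(x, y):
    """Two-sample KS statistic: largest vertical gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    # Each empirical CDF is a step function that only changes at observed
    # values, so the largest gap must occur at one of the pooled data points.
    # bisect_right(xs, t) counts the observations in xs that are <= t.
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )

x = list(range(1, 11))                                   # 1..10
d1 = ks_two_sample(x, [1, 2, 3, 4, 5, 6, 7, 8, 9, 11])   # example (1)
d2 = ks_two_sample(x, [1, 2, 3, 4, 5, 6, 7, 8, 9, 12])   # example (2)
d3 = ks_two_sample(x, list(range(1, 12)))                # y has an extra sample
print(round(d1, 4), round(d2, 4), round(d3, 4))          # prints: 0.1 0.1 0.0909
```

In the last case both samples agree on the values 1 to 10, but the empirical CDF of y only reaches 10/11 at t = 10 while that of x has already reached 1, which gives 1 - 10/11 = 1/11 ≈ 0.0909.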