Hi, I am currently comparing statistics computed by Matlab with those computed by Apache Commons Math functions, the latter tested in Java. For the very same data set, stored in a double array (double[]), I get different results, as follows:
---------------------------------------
Matlab vs Apache
---------------------------------------
max = 0.5451 vs 0.5450980392156862
min = 0.4941 vs 0.49411764705882355
var = 5.4154e-05 vs 5.415357603461868E-5
std = 0.0074 vs 0.007358911334879547
mean = 0.5206 vs 0.5205525290240967
kurtosis = 3.3442 vs 0.35227427833465486
skewness = 0.2643 vs 0.26466432504210746
I checked and rechecked my data: each value in Matlab is the same as the one used in Java. Here we can see that all the statistics agree closely, except for the kurtosis.
Is it possible that kurtosis is computed differently in Matlab and in the Apache library? If so, which value should I trust?
My data is a subset of an image matrix (containing pixel values). For each subset I compute the above statistics. Every time, all the statistics match except for the kurtosis.
The Matlab code for computing the kurtosis of my subset is the following:
kurtosis( sub(:) ); % sub is an n x m matrix
While the one I used in Java is:
import org.apache.commons.math3.stat.descriptive.moment.Kurtosis;
// ...
Kurtosis kurt = new Kurtosis();
System.out.println("-kurtosis: " + kurt.evaluate(subImg) );
Here subImg is a double[] array holding the n x m pixel values.
You can also compute the Apache statistics from within Matlab by importing the Java class. The Apache function uses an unbiased estimator of the population excess kurtosis. Excess kurtosis means subtracting 3, so that the kurtosis of a normal distribution is equal to zero. By default, Matlab's kurtosis neither corrects for bias nor subtracts 3, which explains the difference you see.
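Written out, the two quantities being compared look as follows. The first line is the bias-corrected excess kurtosis as given in the Apache documentation; the second is the plain moment-based kurtosis that Matlab returns by default. The symbols G_2 and k_1 are labels chosen here for illustration, and s denotes the sample standard deviation (normalized by n - 1, as Matlab's std does).

```latex
G_2 = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{s^4}
      - \frac{3(n-1)^2}{(n-2)(n-3)}

k_1 = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^4}
           {\left(\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2\right)^{2}}
```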
To demonstrate this, I also wrote a Matlab function that implements the formula given in the Apache documentation:
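function y = kurtosis_apache(x)
    % Bias-corrected excess kurtosis, following the formula in the Apache documentation.
    n = length(x);
    mean_x = mean(x);
    std_x = std(x);   % sample standard deviation (normalized by n-1)
    y = ( (n*(n+1) / ((n-1)*(n-2)*(n-3))) * sum((x - mean_x).^4) / std_x.^4 ) ...
        - ( (3*(n-1).^2) / ((n-2)*(n-3)) );
end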
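For anyone who wants to reproduce the comparison on the Java side without Matlab, here is a minimal plain-Java sketch of both estimators. It does not call the Apache library. The class and method names (KurtosisSketch, biasedKurtosis, unbiasedExcessKurtosis) are made up for illustration, and throwing an exception for fewer than four values is a choice made for this sketch, not a statement about how either library behaves.

```java
final class KurtosisSketch {

    /** Moment-based kurtosis m4 / m2^2 (no bias correction, no subtraction of 3). */
    static double biasedKurtosis(double[] x) {
        double n = x.length;
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= n;

        double sum2 = 0.0; // sum of squared deviations
        double sum4 = 0.0; // sum of fourth-power deviations
        for (double v : x) {
            double d = v - mean;
            sum2 += d * d;
            sum4 += d * d * d * d;
        }
        double m2 = sum2 / n;
        double m4 = sum4 / n;
        return m4 / (m2 * m2);
    }

    /** Bias-corrected excess kurtosis, same formula as kurtosis_apache above. */
    static double unbiasedExcessKurtosis(double[] x) {
        if (x.length < 4) {
            // Assumption of this sketch: the formula divides by (n-3), so reject n < 4.
            throw new IllegalArgumentException("need at least 4 values");
        }
        double n = x.length;
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= n;

        double sum2 = 0.0;
        double sum4 = 0.0;
        for (double v : x) {
            double d = v - mean;
            sum2 += d * d;
            sum4 += d * d * d * d;
        }
        double variance = sum2 / (n - 1.0);   // sample variance, like Matlab's std(x)^2
        double std4 = variance * variance;    // std^4

        double a = n * (n + 1.0) / ((n - 1.0) * (n - 2.0) * (n - 3.0));
        double b = 3.0 * (n - 1.0) * (n - 1.0) / ((n - 2.0) * (n - 3.0));
        return a * sum4 / std4 - b;
    }

    public static void main(String[] args) {
        double[] x = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("biased kurtosis:          " + biasedKurtosis(x));
        System.out.println("unbiased excess kurtosis: " + unbiasedExcessKurtosis(x));
    }
}
```

For the small sample in main, working the sums by hand gives about 2.78125 for the biased kurtosis and about 0.940625 for the bias-corrected excess kurtosis, so the two conventions can differ substantially on the same data.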
And here is my Command Window session, which shows the Matlab port of the Apache formula, the Java Apache implementation, and the Matlab version (biased and bias-corrected):
javaaddpath('commons-math3-3.2.jar')
import org.apache.commons.math3.stat.descriptive.moment.Kurtosis;
x = randn(1e4,1);
kurtosis_apache(x)
ans = 0.0016
kurt = Kurtosis();
kurt.evaluate(x)
ans = 0.0016
kurtosis(x)
ans = 3.0010
kurtosis(x,0)
ans = 3.0016
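Note also that, according to the Matlab kurtosis documentation, the flag argument specifies whether to correct for bias (flag = 0) or not (flag = 1, the default).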
So with the 0 flag, the bias-corrected Matlab implementation matches the Apache version up to floating-point rounding, once you subtract 3 to turn it into an excess kurtosis:
(kurtosis(x,0)-3)-kurt.evaluate(x)
ans = 3.8636e-14
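As for which value to trust: both are valid, they just follow different conventions, and one can be converted into the other when the sample size n is known. Rearranging the formula in kurtosis_apache gives apache = (n-1)/((n-2)(n-3)) * ((n+1)*k1 - 3(n-1)), where k1 is what Matlab's kurtosis(x) returns by default. Here is a minimal Java sketch of that conversion in both directions. The class and method names are hypothetical, and it assumes n > 3.

```java
final class KurtosisConvert {

    /**
     * Converts Matlab's default kurtosis(x) (biased, not excess) into the
     * bias-corrected excess kurtosis computed by kurtosis_apache.
     * Assumes n > 3.
     */
    static double matlabDefaultToApache(double k1, int n) {
        double nn = n;
        return (nn - 1.0) / ((nn - 2.0) * (nn - 3.0))
                * ((nn + 1.0) * k1 - 3.0 * (nn - 1.0));
    }

    /**
     * Inverse of matlabDefaultToApache: recovers Matlab's default kurtosis(x)
     * from the bias-corrected excess kurtosis. Assumes n > 3.
     */
    static double apacheToMatlabDefault(double excess, int n) {
        double nn = n;
        return (excess * (nn - 2.0) * (nn - 3.0) / (nn - 1.0)
                + 3.0 * (nn - 1.0)) / (nn + 1.0);
    }

    public static void main(String[] args) {
        int n = 8;
        double k1 = 2.78125; // biased kurtosis of {2, 4, 4, 4, 5, 5, 7, 9}
        double excess = matlabDefaultToApache(k1, n);
        System.out.println("excess (bias-corrected): " + excess);
        System.out.println("back to Matlab default:  " + apacheToMatlabDefault(excess, n));
    }
}
```

For large n the correction factor tends to 1, so the Apache value approaches kurtosis(x) - 3; for small image subsets the bias correction is noticeable and the difference is no longer exactly 3.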