I'm currently migrating some Python code to Scala, using the Breeze library as a substitute for NumPy.
Everything looks fine, except that the two standard deviation implementations return different results:
Python:
series = np.array([1,4,5])
np.mean(series)  # 3.3333333333333335
np.std(series)   # 1.699673171197595
Scala:
val vector = breeze.linalg.Vector[Double](Array(1.0, 4.0, 5.0))
val mean = breeze.stats.mean(vector) // 3.3333333333333335
val std = breeze.stats.stddev(vector) // 2.081665999466133
I know how to reproduce Python's behaviour in plain Scala. Sample code is presented here: Scala: What is the generic way to calculate standard deviation
But I'm looking for a way to get the same result with Breeze. Any ideas?
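This is related to the number of degrees of freedom. By default np.std divides by n (ddof=0, the population standard deviation), whereas breeze.stats.stddev divides by n - 1 (the sample standard deviation). Indeed,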
>>> np.std(series, ddof=1)
2.081665999466133
which is the sample standard deviation and matches the Breeze result. With Breeze, one way to get the population standard deviation is to rescale the sample value by sqrt((n-1)/n):
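val n = vector.length
// Convert to Double first: (n-1)/n on Ints is integer division and yields 0
val std = breeze.stats.stddev(vector) * math.sqrt((n - 1).toDouble / n)
// approximately 1.6997, matching np.std(series)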
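The rescaling factor can be checked independently of Breeze and NumPy. The sketch below uses only Python's standard library, on the assumption that statistics.stdev (divides by n - 1) stands in for breeze.stats.stddev and statistics.pstdev (divides by n) stands in for the default np.std; the variable names are illustrative only.

```python
import math
import statistics

series = [1.0, 4.0, 5.0]
n = len(series)

# Sample standard deviation: squared deviations are divided by n - 1.
# This is the convention of breeze.stats.stddev and np.std(series, ddof=1).
sample_std = statistics.stdev(series)

# Population standard deviation: squared deviations are divided by n.
# This is the convention of np.std(series) with its default ddof=0.
population_std = statistics.pstdev(series)

# Multiplying the sample value by sqrt((n - 1) / n) recovers the population value.
rescaled = sample_std * math.sqrt((n - 1) / n)

print(sample_std, population_std, rescaled)
```

The same factor applies to any series with more than one element, so the Breeze result only needs to be multiplied by sqrt((n - 1) / n) to agree with NumPy's default.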