
How to fit data to a normal distribution using Scala Breeze


I am trying to fit data to a normal distribution using Scala Breeze. In Python, the SciPy way to do it is:

from scipy.stats import norm

mu,std = norm.fit(time1)

I am looking for a way to do the same in Scala using Breeze.


Solution

  • Looking at the source code for norm.fit, if you call it with only the data (i.e. no other parameters), it simply returns the mean and the population standard deviation (the one that divides by n) of the data. We can compute the same values in Breeze like so:

    scala> import breeze.linalg.DenseVector
    
    scala> import breeze.stats.{mean, variance}
    
    scala> val data = DenseVector(1d,2d,3d,4d)
    data: breeze.linalg.DenseVector[Double] = DenseVector(1.0, 2.0, 3.0, 4.0)
    
    scala> val mu = mean(data)
    mu: Double = 2.5
    
    scala> val samp_var = variance(data)
    samp_var: Double = 1.6666666666666667
    
    scala> val n = data.length.toDouble
    n: Double = 4.0
    
    scala> val pop_var = samp_var * (n-1)/(n)
    pop_var: Double = 1.25
    
    scala> val pop_std = math.sqrt(pop_var)
    pop_std: Double = 1.118033988749895
    

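    Breeze's variance returns the sample variance (dividing by n - 1), so we rescale it by (n - 1)/n to get the population variance, which is what norm.fit uses. The result matches scipy (up to the last printed digit):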

    In [1]: from scipy.stats import norm
    
    In [2]: mu, std = norm.fit([1,2,3,4])
    
    In [3]: mu
    Out[3]: 2.5
    
    In [4]: std
    Out[4]: 1.1180339887498949
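
For reuse, the steps above can be wrapped in a small helper. This is only a sketch: `FitNormal` and `fitNormal` are hypothetical names, not part of Breeze. It assumes Breeze is on the classpath and that `breeze.stats.variance` returns the sample variance, as the REPL session above shows.

```scala
import breeze.linalg.DenseVector
import breeze.stats.{mean, variance}

// Hypothetical helper object (not part of Breeze).
object FitNormal {

  // Maximum-likelihood fit of a normal distribution, intended to mirror
  // scipy's norm.fit(data) when it is called with the data only.
  // Returns (mean, population standard deviation).
  def fitNormal(data: DenseVector[Double]): (Double, Double) = {
    require(data.length > 0, "need at least one observation")
    val n = data.length.toDouble
    val mu = mean(data)
    // variance(data) divides by (n - 1); rescale so that it divides by n.
    // A single observation is handled separately, since n - 1 would be 0.
    val popVar =
      if (data.length == 1) 0.0
      else variance(data) * (n - 1) / n
    (mu, math.sqrt(popVar))
  }

  def main(args: Array[String]): Unit = {
    val (mu, std) = fitNormal(DenseVector(1d, 2d, 3d, 4d))
    println(s"mu = $mu, std = $std")
  }
}
```

For the data used above, this should give the same mean (2.5) and standard deviation (about 1.118) as the REPL session and the scipy call.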