My code uses org.apache.spark.ml.clustering.GaussianMixture, but its init method, private def initRandom(...), does not work well, so I want to write a custom init method.
At first I wanted to extend class GaussianMixture, but initRandom is a private method, so it cannot be overridden.
Then I tried another way, which was to set an initial GMM, but sadly the source code only has a TODO for that: "TODO: SPARK-15785 Support users supplied initial GMM."
I've also tried copying the code of class GaussianMixture into my own custom class, but there are too many things attached to it. GaussianMixture.scala comes with a number of classes and traits, some of which are only accessible within the ML packages.
I solved it myself. Here is my solution.
I created a class CustomGaussianMixture which extends GaussianMixture from the official package org.apache.spark.ml.clustering.
Within my project, I created a new package, also named org.apache.spark.ml.clustering, and placed my custom class in it. This avoids dealing with the scope of the various package-restricted classes, traits, and objects in org.apache.spark.ml.clustering.
The next step is to override fit, the method that calls initRandom. Unlike initRandom, fit is not private, so I can override it. Specifically, I wrote my new init method in class CustomGaussianMixture, copied the method fit from the official source code in GaussianMixture.scala into class CustomGaussianMixture, and modified CustomGaussianMixture.fit() to call my custom init method instead of initRandom.
Finally, just use CustomGaussianMixture instead of GaussianMixture wherever needed.
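The steps above can be sketched roughly as the skeleton below. It is only an outline, not working clustering code: the body of fit has to be pasted from GaussianMixture.scala of the Spark version you build against, and the custom init method is left as a comment because the parameters and return type of initRandom differ between Spark releases. The UID prefix "CustomGaussianMixture" is my own choice.

```scala
// This file lives in your own project, but is declared in Spark's package so that
// the package-restricted helpers used by the copied fit() body are in scope.
package org.apache.spark.ml.clustering

import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.Dataset

class CustomGaussianMixture(override val uid: String) extends GaussianMixture(uid) {

  // No-argument constructor, mirroring the one on GaussianMixture.
  def this() = this(Identifiable.randomUID("CustomGaussianMixture"))

  // Paste the body of GaussianMixture.fit from your Spark version here, then
  // replace its call to initRandom(...) with a call to your own init method.
  override def fit(dataset: Dataset[_]): GaussianMixtureModel = {
    ??? // placeholder: throws NotImplementedError until the copied body is pasted in
  }

  // Define your custom init method here. Give it the same parameters and return
  // type as initRandom in your Spark version, so it can be swapped in directly.
}
```

Since the class extends GaussianMixture, the inherited setters still apply, so something like new CustomGaussianMixture().setK(3).fit(df) should work once fit is filled in (df here is a hypothetical DataFrame with a features column).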