In the spark.mllib library, KMeans has a method to set the epsilon parameter when building a KMeans instance.
But I do not see any equivalent method on KMeans in the new spark.ml library. The reason I am asking is that the new KMeans generates fewer clusters than I specified in the setK() method, so I want to increase the number of clusters generated by decreasing epsilon a bit.
Does anyone know how to set epsilon in the new spark.ml KMeans class?
org.apache.spark.ml.clustering.KMeans
Thanks.
Epsilon in the spark.ml library has been renamed to tol (short for tolerance).
Example:
KMeans kmeans = new KMeans().setK(2).setSeed(1L).setTol(0.0001);
KMeansModel model = kmeans.fit(dataset);
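
As far as I understand it, tol is the convergence tolerance: iteration stops once no center moves by more than tol between rounds, so it controls when the algorithm stops rather than how many clusters it produces. Below is a minimal plain-Java sketch of that stopping rule on 1-D data. It is only an illustration and not Spark's actual implementation; the class TolSketch and the method fit are hypothetical names made up for this example.

```java
// Illustrative 1-D Lloyd's iteration showing the role of a convergence tolerance.
// Not Spark code: TolSketch and fit are hypothetical names for this sketch.
final class TolSketch {

    /**
     * Runs k-means on 1-D points, updating centers in place.
     * Returns the number of iterations performed.
     */
    static int fit(double[] points, double[] centers, double tol, int maxIter) {
        int iterations = 0;
        boolean converged = false;
        while (!converged && iterations < maxIter) {
            double[] sum = new double[centers.length];
            int[] count = new int[centers.length];

            // Assignment step: each point goes to its nearest center.
            for (double p : points) {
                int best = 0;
                for (int c = 1; c < centers.length; c++) {
                    if (Math.abs(p - centers[c]) < Math.abs(p - centers[best])) {
                        best = c;
                    }
                }
                sum[best] += p;
                count[best]++;
            }

            // Update step: move each center to the mean of its points.
            // Convergence means no center moved by more than tol.
            converged = true;
            for (int c = 0; c < centers.length; c++) {
                if (count[c] == 0) {
                    continue; // empty cluster: leave this center where it is
                }
                double updated = sum[c] / count[c];
                if (Math.abs(updated - centers[c]) > tol) {
                    converged = false;
                }
                centers[c] = updated;
            }
            iterations++;
        }
        return iterations;
    }
}
```

For example, with the points {1, 2, 3, 10, 11, 12} and starting centers {0, 5}, calling TolSketch.fit with tol = 1e-4 runs three iterations and ends with centers {2, 11}, while tol = 100 stops after one iteration with centers {1.5, 9}. In both cases there are still two centers, so in this sketch a smaller tol only refines the centers; it does not add clusters.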