scala apache-spark apache-spark-sql apache-spark-mllib

Scala Spark load saved MLlib model


I have saved a Random Forest model to an S3 path and now I want to load it. However, loading fails with an error saying that a method does not exist.

Code (saving the model works):

    import org.apache.spark.ml.classification.RandomForestClassifier

    // Configure the estimator (the untrained algorithm and its parameters).
    val rfClassifier = new RandomForestClassifier()
      .setImpurity("gini")
      .setMaxDepth(8)
      .setNumTrees(200)
      .setFeatureSubsetStrategy("auto")
      .setSeed(18)

    // fit returns the trained model, which is what gets written to S3.
    val rfModel = rfClassifier.fit(trainingFeatures)
    rfModel
      .write
      .overwrite()
      .save(<MY S3 PATH>)

Code (loading the model doesn't work):

    val rfmodel = RandomForestClassifier.load(<MY S3 PATH>)

Error:

java.lang.NoSuchMethodException: org.apache.spark.ml.classification.RandomForestClassificationModel.<init>(java.lang.String)

I'm not sure why this error occurs, since the load method exists.


Solution

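  • You should load RandomForestClassificationModel, not RandomForestClassifier. RandomForestClassifier is the estimator (the untrained algorithm plus its parameters); fit returns a RandomForestClassificationModel, and that is what you saved. The estimator's loader appears to read the class name stored in the saved metadata and try to construct it from a single String uid, a constructor the model class does not have, which would explain the NoSuchMethodException.

    Replace with:

    import org.apache.spark.ml.classification.RandomForestClassificationModel

    val rfmodel = RandomForestClassificationModel.load(<MY S3 PATH>)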
    

    More about model persistence is covered in the Spark ML documentation.
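To check the save and load pairing without touching S3, a round trip on a local path can help. The following is only a minimal sketch under stated assumptions: the object name RandomForestRoundTrip, the helper saveAndLoad, the toy data, the local[1] session and the temporary directory are all hypothetical stand-ins for the asker's trainingFeatures and S3 path.

```scala
import java.nio.file.Files

import org.apache.spark.ml.classification.{RandomForestClassificationModel, RandomForestClassifier}
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

object RandomForestRoundTrip {

  // Fits a small forest, saves it to modelPath, and loads it back.
  // Returns (model that was saved, model that was loaded).
  def saveAndLoad(
      spark: SparkSession,
      modelPath: String
  ): (RandomForestClassificationModel, RandomForestClassificationModel) = {
    // Hypothetical toy data standing in for trainingFeatures.
    val trainingFeatures = spark.createDataFrame(Seq(
      (0.0, Vectors.dense(0.0, 1.0)),
      (1.0, Vectors.dense(1.0, 0.0)),
      (0.0, Vectors.dense(0.1, 0.9)),
      (1.0, Vectors.dense(0.9, 0.2))
    )).toDF("label", "features")

    // Estimator: holds the configuration, not the trained trees.
    val rfClassifier = new RandomForestClassifier()
      .setNumTrees(5)
      .setSeed(18)

    // fit produces the model; this is the object that gets persisted.
    val rfModel: RandomForestClassificationModel = rfClassifier.fit(trainingFeatures)
    rfModel.write.overwrite().save(modelPath)

    // Load with the model class, the counterpart of what was saved.
    val loaded = RandomForestClassificationModel.load(modelPath)
    (rfModel, loaded)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would reuse its own session.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("rf-save-load-sketch")
      .getOrCreate()

    // Hypothetical local path used in place of the S3 location.
    val modelPath = Files.createTempDirectory("rf-model").resolve("model").toString

    val (saved, loaded) = saveAndLoad(spark, modelPath)
    println(s"saved uid:  ${saved.uid}")
    println(s"loaded uid: ${loaded.uid}")
    println(s"trees in loaded model: ${loaded.trees.length}")

    spark.stop()
  }
}
```

The same rule should apply to the other Spark ML estimators: whatever type fit returns is the type whose load method reads the saved directory back.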