I'm new to PySpark. I just saved my LinearSVC model to a folder called "svm.model", which gave me two subfolders: data and metadata.
Now I'm trying to load the model. This is the code I use to load it:
# Spark environment
from pyspark.sql import SparkSession
from pyspark.ml.classification import LinearSVC
spark = SparkSession.builder.getOrCreate()
# read model
lsvc = LinearSVC(maxIter=10, regParam=0.1)
samemodel = lsvc.load("svm.model/")
But when loading the model I get this error:
File "C:/Users/Ayoub/PycharmProjects/sparkdemo/validation.py", line 9, in <module>
samemodel = lsvc.load("svm.model/")
File "E:\spark-3.0.1-bin-hadoop2.7\python\pyspark\ml\util.py", line 330, in load
return cls.read().load(path)
File "E:\spark-3.0.1-bin-hadoop2.7\python\pyspark\ml\util.py", line 280, in load
java_obj = self._jread.load(path)
File "E:\spark-3.0.1-bin-hadoop2.7\python\lib\py4j-0.10.9-src.zip\py4j\java_gateway.py", line 1305, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "E:\spark-3.0.1-bin-hadoop2.7\python\pyspark\sql\utils.py", line 128, in deco
return f(*a, **kw)
File "E:\spark-3.0.1-bin-hadoop2.7\python\lib\py4j-0.10.9-src.zip\py4j\protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o24.load.
: java.lang.NoSuchMethodException: org.apache.spark.ml.classification.LinearSVCModel.<init>(java.lang.String)
at java.lang.Class.getConstructor0(Unknown Source)
at java.lang.Class.getConstructor(Unknown Source)
at org.apache.spark.ml.util.DefaultParamsReader.load(ReadWrite.scala:468)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Unknown Source)
20/11/19 13:22:31 WARN ProcfsMetricsGetter: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped
I'm not sure what this means; this is the first time I've tried to save and load a model with PySpark. Is something wrong with my model folder "svm.model", or with the way I'm loading it?
I was using the wrong class to load the model. The following code works:
from pyspark.ml.classification import LinearSVCModel
samemodel = LinearSVCModel.load("svm.model/")
So to train the model we use LinearSVC (the estimator), and to load the saved, fitted model we use LinearSVCModel.
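If you are unsure which class a saved folder expects, one way to check is to read the metadata subfolder. The sketch below is not part of the original answer. It assumes the usual Spark ML layout, where metadata holds a part-* text file whose first line is a JSON object with a "class" field. The helper names saved_model_class and python_class_name are hypothetical and do not come from PySpark.

```python
import json
from pathlib import Path


def saved_model_class(model_dir):
    """Return the JVM class name recorded in a saved Spark ML model's metadata.

    Assumes the metadata was written as a text file (part-* files) whose
    first non-empty line is a JSON object containing a "class" key.
    """
    metadata_dir = Path(model_dir) / "metadata"
    # "part-*" skips hidden checksum files such as ".part-00000.crc".
    for part in sorted(metadata_dir.glob("part-*")):
        for line in part.read_text(encoding="utf-8").splitlines():
            if line.strip():
                return json.loads(line)["class"]
    raise FileNotFoundError(f"no metadata part file found in {metadata_dir}")


def python_class_name(jvm_class):
    """Keep only the last dotted component of a fully qualified class name."""
    return jvm_class.rsplit(".", 1)[-1]
```

Calling python_class_name(saved_model_class("svm.model")) on the folder from the question should give the name of the class whose load method you need. Given the LinearSVCModel constructor lookup in the stack trace, that is presumably LinearSVCModel here.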