I am using the MLP classifier (MultilayerPerceptronClassifier) from pyspark.ml.classification. I fit the model with cross-validation: I use ParamGridBuilder to define a grid of hyperparameters, then the CrossValidator class to train the model and select the best ones. After training, when I try to read the best hyperparameters from the fitted cross-validation model, I get an error. Could anyone tell me how to get the best hyperparameters?
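from pyspark.ml import Pipeline
from pyspark.ml.classification import MultilayerPerceptronClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

# Network with 4 inputs, two hidden layers (5 and 4 units) and 3 output classes
layers = [4, 5, 4, 3]
clf = MultilayerPerceptronClassifier(labelCol='label', layers=layers)
pipeline = Pipeline(stages=[clf])
x1 = 'stepSize'
x2 = 'maxIter'
# Grid over stepSize; a grid for x2 (maxIter) is added the same way
paramGrid = ParamGridBuilder() \
    .addGrid(getattr(clf, x1), [0.1, 0.2]) \
    .build()
evaluator = MulticlassClassificationEvaluator(labelCol='label',
    predictionCol='prediction', metricName='f1')
crossval = CrossValidator(estimator=pipeline,
    estimatorParamMaps=paramGrid,
    evaluator=evaluator)
cvModel = crossval.fit(train_data)
# This call raises the error below
cvModel.bestModel.stages[0]._java_obj.getMaxIter()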
Py4JError: An error occurred while calling o1127.getMaxIter. Trace:
py4j.Py4JException: Method getMaxIter([]) does not exist
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
at py4j.Gateway.invoke(Gateway.java:274)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
The call cvModel.bestModel.stages[0]._java_obj.getMaxIter() works when I use the logistic regression or random forest classifiers. I get the error only when I use the MLP classifier. Is there any way to get the best hyperparameters when using the MLP classifier?
I was getting the same error running exactly the same code, and the following line from the following post solved the problem for me.
So the part you're missing is the parent() call: cvModel.bestModel.stages[0]._java_obj.parent().getMaxIter(). Hope this helps!
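To avoid retyping the chain for each hyperparameter, the parent() lookup can be wrapped in a small helper. This is only a minimal sketch: best_java_param is a hypothetical name of my own, not part of PySpark, and it assumes the fitted stage exposes _java_obj and that the parent estimator has a getter named after the param (getMaxIter for maxIter, getStepSize for stepSize).

```python
def best_java_param(fitted_stage, param_name):
    """Read a hyperparameter from the estimator that produced a fitted stage.

    fitted_stage: a fitted model stage, e.g. cvModel.bestModel.stages[0]
    param_name:   a param name such as 'maxIter' or 'stepSize'

    Hypothetical helper (not a PySpark API); relies on the private
    _java_obj attribute, so treat it as a workaround.
    """
    # parent() gives the JVM-side estimator that was fit to produce this model;
    # the getter is looked up there instead of on the model object itself.
    java_estimator = fitted_stage._java_obj.parent()
    # Build the getter name from the param name: 'maxIter' -> 'getMaxIter'
    getter_name = "get" + param_name[0].upper() + param_name[1:]
    return getattr(java_estimator, getter_name)()
```

With the cvModel from the question, usage would look like best_java_param(cvModel.bestModel.stages[0], 'maxIter'), and the same call with 'stepSize' for the other grid parameter.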