I am running cross-validation on a dataset over a grid of hyperparameters:
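from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
lr = LogisticRegression()
# Grid values are written as floats because both params are double-typed
paramGrid = ParamGridBuilder() \
    .addGrid(lr.regParam, [0.0, 0.01, 0.05, 0.1, 0.5, 1.0]) \
    .addGrid(lr.elasticNetParam, [0.0, 0.1, 0.5, 0.8, 1.0]) \
    .build()
evaluator = BinaryClassificationEvaluator()
cv = CrossValidator(estimator=lr, estimatorParamMaps=paramGrid, evaluator=evaluator)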
I want to know the best values for regParam and elasticNetParam. In Python we have an option to get the best parameters after cross-validation. Is there a method in PySpark that returns the best parameter values after cross-validation?
For example:
regParam - 0.05
elasticNetParam - 0.1
Well, you have to fit your CrossValidator first:
cv_model = cv.fit(train_data)
After that, the best model is available as:
best_model = cv_model.bestModel
To extract the parameters, you have to go through the underlying Java object, which is a bit ugly:
best_reg_param = best_model._java_obj.getRegParam()
best_elasticnet_param = best_model._java_obj.getElasticNetParam()
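If reaching into _java_obj feels too fragile, another option is to pair each entry of the parameter grid with its average metric and pick the winner yourself. This is only a sketch under a few assumptions: best_params is a hypothetical helper (it is not part of PySpark), and it relies on the metrics being listed in the same order as the grid, which is how cv_model.avgMetrics lines up with cv.getEstimatorParamMaps().

```python
def best_params(param_maps, avg_metrics, larger_is_better=True):
    """Hypothetical helper: return ({param_name: value}, metric) for the best grid entry.

    param_maps  -- list of dicts, e.g. cv.getEstimatorParamMaps()
    avg_metrics -- one average metric per grid entry, e.g. cv_model.avgMetrics
    """
    if not param_maps or len(param_maps) != len(avg_metrics):
        raise ValueError("param_maps and avg_metrics must be non-empty and the same length")

    # Pick the index with the highest (or lowest) average metric
    pick = max if larger_is_better else min
    best_index = pick(range(len(avg_metrics)), key=lambda i: avg_metrics[i])

    # Spark's param maps are keyed by Param objects; use their .name for readability,
    # falling back to str() for plain keys
    named = {
        getattr(param, "name", str(param)): value
        for param, value in param_maps[best_index].items()
    }
    return named, avg_metrics[best_index]


# Assumed usage with the objects from the question and answer:
# params, score = best_params(
#     cv.getEstimatorParamMaps(),
#     cv_model.avgMetrics,
#     evaluator.isLargerBetter(),
# )
# print(params["regParam"], params["elasticNetParam"], score)
```

Passing evaluator.isLargerBetter() keeps the helper correct for metrics where a smaller value is better, so the same code should work if you swap in a different evaluator.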