I am using a pipeline in a hyperparameter gridsearch in sklearn. I would like the search to return multiple evaluation scores - one a custom scoring function that I wrote, and the other the default score function of the pipeline.
I tried using the parameter `scoring={'pipe_score': make_scorer(pipe.score), 'my_score': my_scoring_func}` in my `GridSearchCV` instance (`pipe` is the name of my pipeline variable), but this returns `nan` for `pipe_score`.
What is a correct way to do this?
I think you can accomplish this by just using `None` as the value in the dict:
```python
scoring = {
    'pipe_score': None,
    'my_score': my_scoring_func,
}
```
To address your attempts a bit more (moving from comments):
When debugging `nan` scores in a hyperparameter search, always set `error_score="raise"` so that you get the full error traceback.
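For example (reusing the hypothetical `pipe` and `scoring` from above, with an assumed step name `clf`):

```python
# Errors raised while scoring now propagate instead of becoming nan.
search = GridSearchCV(pipe, param_grid={"clf__C": [0.1, 1.0]},
                      scoring=scoring, error_score="raise")
search.fit(X, y)
```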
`make_scorer` takes a metric with signature `(y_true, y_pred)` and turns it into a scorer with signature `(estimator, X_test, y_test)`. Since `pipe.score` is already a scorer (with `self` as the estimator), you don't need this convenience function.
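Schematically (a sketch of the two signatures, not sklearn's actual internals):

```python
import numpy as np
from sklearn.metrics import make_scorer

# A metric compares labels to predictions:
def my_metric(y_true, y_pred):
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

# make_scorer turns it into something callable on a fitted estimator;
# roughly, scorer(est, X_test, y_test) == my_metric(y_test, est.predict(X_test)).
scorer = make_scorer(my_metric)
```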
That won't fix things though: `pipe.score` is a method bound to the instance `pipe`, so during the search you'd be calling it on that same unfitted instance rather than on the fitted clones the search creates for each candidate and fold: you'd get the same score everywhere, or an error.