I am using a pipeline in a hyperparameter gridsearch in sklearn. I would like the search to return multiple evaluation scores - one a custom scoring function that I wrote, and the other the default score function of the pipeline.
I tried using the parameter `scoring={'pipe_score': make_scorer(pipe.score), 'my_score': my_scoring_func}` in my `GridSearchCV` instance (`pipe` is the name of my pipeline variable), but this returns `nan` for `pipe_score`.
What is a correct way to do this?
I think you can accomplish this by just using `None` as the value in the dict:
```python
scoring = {
    'pipe_score': None,
    'my_score': my_scoring_func,
}
```
To address your attempts a bit more (moving from comments):
When debugging `nan` scores in a hyperparameter search, always set `error_score="raise"` so that you get the full error traceback.
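For example (reusing the hypothetical `pipe` and `scoring` from above, with an assumed step name `clf`):

```python
# Errors raised while scoring now propagate instead of becoming nan.
search = GridSearchCV(pipe, param_grid={"clf__C": [0.1, 1.0]},
                      scoring=scoring, error_score="raise")
search.fit(X, y)
```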
`make_scorer` takes a metric with signature `(y_true, y_pred)` and turns it into a scorer with signature `(estimator, X_test, y_test)`. Since `pipe.score` is already a scorer (with `self` as the estimator), you don't need this convenience function.
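Schematically (a sketch of the two signatures, not sklearn's actual internals):

```python
import numpy as np
from sklearn.metrics import make_scorer

# A metric compares labels to predictions:
def my_metric(y_true, y_pred):
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

# make_scorer turns it into something callable on a fitted estimator;
# roughly, scorer(est, X_test, y_test) == my_metric(y_test, est.predict(X_test)).
scorer = make_scorer(my_metric)
```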
That won't fix things though: `pipe.score` is a method bound to the instance `pipe`, so during the search you'd be calling it on that same unfitted instance rather than on the fitted clones the search creates for each candidate and fold: you'd get the same score everywhere, or an error.