
Python: scoring = 'recall' in GridSearchCV


I have a binary classification problem, and I am trying to find the best parameters for my model with:

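from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# liblinear supports both the 'l1' and 'l2' penalties (the default lbfgs solver does not support 'l1')
grid = {'penalty': ['l1', 'l2'], 'C': [0.001, 0.009, 0.01, 0.09, 1, 5, 10, 25]}
logreg = GridSearchCV(LogisticRegression(solver='liblinear'), grid, cv=5, scoring='recall')
logreg.fit(X, Y)
Y_Pred = logreg.predict(X)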

I would like to know what exactly the parameter scoring = 'recall' does. When I add it, my model improves a lot.


Solution

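  • The scoring parameter sets the metric used to evaluate each parameter combination during cross-validation; GridSearchCV then selects the combination with the best mean score on that metric. scikit-learn supports many scorers, and the full list of available scorers is in its documentation.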

    High recall means that your model produces many true positives and few false negatives, since recall = TP / (TP + FN). In other words, most of the actual positive samples are predicted as positive and few are predicted as negative. You may also like to read more about the confusion matrix.

    As for which kind of scoring you should use, that depends on what you are trying to achieve with your model.
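
To make the points above concrete, here is a minimal sketch of how scoring = 'recall' drives the search and how recall relates to the confusion matrix. The synthetic dataset from make_classification, the reduced C grid, and the liblinear solver are assumptions chosen for illustration; they are not taken from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.model_selection import GridSearchCV

# Synthetic, imbalanced binary data (an assumption, for illustration only).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)

# Hypothetical, reduced grid; each C is scored by 5-fold cross-validated recall.
grid = {'C': [0.01, 0.1, 1, 10]}
search = GridSearchCV(
    LogisticRegression(solver='liblinear'), grid, cv=5, scoring='recall'
)
search.fit(X, y)

# best_score_ is the mean cross-validated recall of the selected candidate.
print(search.best_params_, search.best_score_)

# Recall computed by hand from the confusion matrix: TP / (TP + FN).
y_pred = search.predict(X)
tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
manual_recall = tp / (tp + fn)
print(manual_recall, recall_score(y, y_pred))
```

Note that changing scoring changes which parameter combination is selected and which number is reported; a search scored on recall alone does not account for false positives, so it can be worth checking precision as well.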