python scikit-learn classification precision-recall

How are 'positive' and 'negative' determined in Recall formula for machine learning?


In Python's scikit-learn there's a function called 'sklearn.metrics.recall_score' (http://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html), which computes recall as TP/(TP+FN).

In my unsupervised classification model I am classifying emails into 2 classes: 1 or -1. So my question is: how does the 'recall_score' function know which of the labels '1' and '-1' is 'positive' and which is 'negative', since I didn't specify which is which? If it treats '1' as positive, the recall will be different from the result when it treats '-1' as positive.

Sorry if my question is not clear. Please let me know if you want me to provide any clarification. Thanks!


Solution

  • You need to specify the positive label. For example, if you want -1 to be your positive label, call:

    recall_score(y_true=[...], y_pred=[...], pos_label=-1)
    

    Note that pos_label defaults to 1, so if you don't pass it, label 1 is treated as the positive class.
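
    To see how much the choice matters, here is a minimal sketch with made-up labels (the y_true and y_pred values below are illustrative assumptions, not data from the question). It computes recall once with the default positive label and once with pos_label=-1:

```python
from sklearn.metrics import recall_score

# Made-up labels for illustration only (not taken from the question).
y_true = [1, 1, 1, -1, -1, 1]
y_pred = [1, -1, 1, -1, 1, 1]

# Default pos_label=1: TP and FN are counted over the samples whose true label is 1.
# Four true 1s, three predicted as 1 -> 3 / (3 + 1).
recall_pos1 = recall_score(y_true, y_pred)

# Treat -1 as the positive class: TP and FN are counted over the true -1 samples.
# Two true -1s, one predicted as -1 -> 1 / (1 + 1).
recall_neg1 = recall_score(y_true, y_pred, pos_label=-1)

print(recall_pos1)  # 0.75
print(recall_neg1)  # 0.5
```

    The same predictions give 0.75 or 0.5 depending only on which label is declared positive, so it is safer to pass pos_label explicitly whenever the labels are not the usual 0/1.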