machine-learning precision-recall

F score when negative class is rare


I have a dataset where 20% of the data belongs to the negative class and 80% to the positive class. When calculating the F score, I assume that precision is TP/(TP+FP). Should I 'invert' the formula because my less frequent class is the negative one? So it would be TN/(TN+FN)?


Solution

  • First of all, what you have written is not the F1-score. That is Precision!

    To compute the F1-score, first compute precision = TP/(TP+FP) and recall = TP/(TP+FN). The F1-score is their harmonic mean: F1 = 2*(P*R)/(P+R). See this for further details.

    You can compute these values for each class to see how well you are doing on the classification task. If you compute them for the negative class, the true negatives take the place of the true positives, as you said: precision for the negative class is TN/(TN+FN) and recall is TN/(TN+FP). Note that "true positive" simply means correctly classified for the class of interest. It has nothing to do with the class value.

    Finally, you can also compute precision, recall, and F1 for both classes and take their average (the macro average). It all comes down to how you want to judge your classifier's performance. If it is more important to classify the negative instances accurately, you should focus on getting high precision for the negative class (of course without screwing up the other class!). The same goes for recall.
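
As a rough illustration of the points above, here is a minimal Python sketch that computes precision, recall, and F1 with each class in turn as the class of interest, then averages the two F1 values. The confusion counts are made up for the example (100 samples, 80 positive and 20 negative), and the helper name `prf` is hypothetical, not taken from any library.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 for one class of interest, given its counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, assumed only for illustration.
TP, FN = 72, 8   # 80 actual positives
TN, FP = 15, 5   # 20 actual negatives

# Positive class as the class of interest.
pos = prf(tp=TP, fp=FP, fn=FN)

# Negative class as the class of interest: TN plays the role of TP,
# and FN and FP swap roles, giving precision = TN/(TN+FN).
neg = prf(tp=TN, fp=FN, fn=FP)

# Unweighted (macro) average of the two per-class F1 scores.
macro_f1 = (pos[2] + neg[2]) / 2

print("positive: P=%.3f R=%.3f F1=%.3f" % pos)
print("negative: P=%.3f R=%.3f F1=%.3f" % neg)
print("macro F1: %.3f" % macro_f1)
```

With counts like these, the rare negative class scores noticeably lower than the positive class, which is why looking at the per-class numbers (or their average) can be more informative than the positive-class F1 alone.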