              precision    recall  f1-score   support
Class 0            1.00      0.98      0.99    125000
Class 1            0.33      0.84      0.47      1500
Hi guys,
In this model, the F1 score for class 1, the minority class, is not very good.
My thought is: if the model predicts class 0 so well, why don't we just flip the question around and predict class 0? Since there are only two classes, if a sample is not class 0, it must be class 1.
In other words, if the model can identify a sample as class 0, it is definitely not class 1 (especially when class 0 has a precision of 1.00). Which would mean the model still does well.
Does it work this way? If not, why not?
Many thanks in advance.
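(To make the question concrete, here is a small pure-Python sketch. The supports and recalls are taken from the report above; the confusion-matrix counts are my reconstruction from those numbers. It shows that "predict class 0 and call everything else class 1" is exactly what the model already outputs, so flipping changes nothing.)

```python
# Rebuild the confusion matrix implied by the report above
# (supports and per-class recalls come straight from the table).
n0, n1 = 125_000, 1_500
recall0, recall1 = 0.98, 0.84

tn = round(n0 * recall0)   # class-0 samples predicted 0: 122_500
fp = n0 - tn               # class-0 samples predicted 1:   2_500
tp = round(n1 * recall1)   # class-1 samples predicted 1:   1_260
fn = n1 - tp               # class-1 samples predicted 0:     240

precision0 = tn / (tn + fn)  # ~0.998, rounds to 1.00 in the report
precision1 = tp / (tp + fp)  # ~0.335, the 0.33 in the report

# "Flipping": predict class 0, then label everything else class 1.
# That is literally the model's existing output, so every metric is
# unchanged - class-1 precision is still dragged down by the 2,500
# false positives.
print(f"precision class 0: {precision0:.3f}")
print(f"precision class 1: {precision1:.3f}")
```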
You are reasoning from the intuition that the model really learned class 0. Under data imbalance, these scores (high recall, high precision on the majority class) can be deceptive and carry less meaning.
Let me give you an example. Suppose you appoint a blind person to classify red apples and green apples, and your data is 99 red apples and 1 green apple. When you hand him a red apple (he doesn't know what color it is), he just randomly says "red", and you get happy and give him a reward (in ML terms, he incurs a lower loss). Now he knows that saying "red" earns a reward, so he exploits this and says "red" all the time (he'll miss the one green apple, but that barely dents his total reward). If you didn't know the person was blind, you might say: "I can use him as an apple classifier; he knows red apples so well that I can just invert his answer whenever it isn't a red apple." But you know he's blind: he doesn't actually know whether a red apple is a red apple.
We can think of our model the same way. Its job is to reduce the loss, and it will exploit any loophole to do so. Given imbalanced data, it learns that always predicting class 0 (the majority class) reduces the loss, so that's what it does. From a geometric perspective: you have points of two colors (the two classes) and a line to separate them (the decision boundary). If you draw that line far off to one side, so that every point in the dataset falls on the "class 0" side and the empty side is "class 1", this degenerate model would still produce a high precision score for class 0. Precision tells us we can trust the model when it predicts class 0, but can we really, when we know it didn't actually learn anything?
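The blind-grader scenario can be sketched in a few lines of pure Python (a hypothetical 99-red / 1-green dataset, as in the example above): a model that always predicts the majority class scores impressively on accuracy and majority-class precision while having learned nothing, which recall on the minority class immediately exposes.

```python
# Hypothetical imbalanced dataset: 99 "red" apples, 1 "green" apple.
y_true = ["red"] * 99 + ["green"]
y_pred = ["red"] * 100          # the "blind" model: always predict the majority

# Accuracy looks great: 99 of 100 correct.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Precision for "red": of everything predicted red, how much really was red?
red_predictions = [t for t, p in zip(y_true, y_pred) if p == "red"]
precision_red = red_predictions.count("red") / len(red_predictions)

# Recall for "green" exposes the failure: the one green apple was missed.
recall_green = 0 / 1

print(accuracy)        # 0.99
print(precision_red)   # 0.99
print(recall_green)    # 0.0
```

Despite 99% accuracy and 99% majority-class precision, the model has zero ability to find the minority class, which is usually the class you care about.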
So these are the problems with imbalanced data: the cost (loss) distribution gets skewed as well, which hampers the model's ability to learn the minority class properly.