Tags: scikit-learn, metrics, imbalanced-data

Meaning of 'weighted' metrics in scikit-learn: does the bigger class get more weight or the smaller class?


I am dealing with an imbalanced dataset and tried to handle it through the choice of validation metric. In the scikit-learn documentation I found the following description of 'weighted':

Calculate metrics for each label, and find their average weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.

Does calculating the average weighted by support mean that classes with more samples are weighted higher than those with fewer samples, or, as seems more logical to me, that smaller classes are weighted more heavily than bigger ones?

I couldn't find anything about this in the documentation and wanted to make sure I am choosing the right metric.

Thanks!


Solution

  • Short answer: "weighted by support" means the higher the support, the higher the weight. In other words, the more samples a class has, the more its per-class score contributes to the averaged metric (see the sketch after this answer).

    That being said, please be aware that you have not "handled" the class imbalance just by choosing another way of averaging your metric. I believe these averaging options are meant to offer you another perspective on your model's performance.

    Typically, models perform much better on the majority class. Using a weighted metric will overemphasize this. But the model still has the same, probably quite poor, performance on the minority class(es). If they happen to be the important one(s), you might end up just fooling yourself.
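To make this concrete, here is a minimal sketch with made-up toy labels (the class counts and predictions below are assumptions, purely for illustration): with 90 samples of class 0 and only 10 of class 1, the 'weighted' F1 is pulled toward the majority-class score, while 'macro' treats both classes equally.

    import numpy as np
    from sklearn.metrics import f1_score

    # Imbalanced toy labels: 90 samples of class 0, 10 samples of class 1.
    y_true = np.array([0] * 90 + [1] * 10)

    # A classifier that does well on the majority class and poorly on the minority.
    y_pred = np.concatenate([
        [0] * 85 + [1] * 5,   # predictions for the 90 true class-0 samples
        [0] * 6 + [1] * 4,    # predictions for the 10 true class-1 samples
    ])

    per_class = f1_score(y_true, y_pred, average=None)       # roughly [0.94, 0.42]
    macro = f1_score(y_true, y_pred, average='macro')         # plain mean, about 0.68
    weighted = f1_score(y_true, y_pred, average='weighted')   # support-weighted, about 0.89

    # 'weighted' is just the support-weighted mean of the per-class scores:
    support = np.bincount(y_true) / len(y_true)               # [0.9, 0.1]
    assert np.isclose(weighted, np.sum(support * per_class))

The weighted score looks much better than the macro score even though the minority-class F1 is poor, which is exactly the "fooling yourself" effect described above.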