Tags: keras, data-science, imbalanced-data

How to determine the class_weights for an imbalanced dataset


I am working on an imbalanced dataset and am trying to build the model with the help of class_weight. On what basis should I determine the class weights?

The labels and their counts are as follows:

label    Count
2        47213
3        2096
4        2021
1        737
0        176

So what values should I pass as the class_weight argument in this call:

model.fit(X_train, Y_train, epochs=5, batch_size=32, class_weight=class_weight)


Solution

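  • You can use compute_class_weight from scikit-learn for that. Recent scikit-learn versions require its arguments to be passed by keyword, and Keras expects class_weight as a dict mapping class index to weight. Y_train must hold 1-D integer labels here, not one-hot vectors.

    import numpy as np
    from sklearn.utils import compute_class_weight
    classes = np.unique(Y_train)  # sorted unique labels, e.g. [0 1 2 3 4]
    class_weight = dict(zip(classes, compute_class_weight(class_weight="balanced", classes=classes, y=Y_train)))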
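
To see what the "balanced" setting works out to for the counts in the question, here is a minimal sketch in plain Python. It assumes the heuristic that scikit-learn documents for "balanced", n_samples / (n_classes * count), and the `counts` dict is simply the table from the question typed in by hand (an illustrative name, not part of any library).

```python
# Label counts copied from the table in the question.
counts = {2: 47213, 3: 2096, 4: 2021, 1: 737, 0: 176}

n_samples = sum(counts.values())
n_classes = len(counts)

# "balanced" heuristic: weight_c = n_samples / (n_classes * count_c),
# so rarer classes receive proportionally larger weights.
class_weight = {
    label: n_samples / (n_classes * count)
    for label, count in sorted(counts.items())
}

for label, weight in class_weight.items():
    print(label, round(weight, 3))
```

The resulting dict maps class index to weight, which is the shape model.fit takes for class_weight. The majority class 2 ends up with a weight below 1 and the rarest class 0 with the largest weight. If these weights make training unstable, softer variants (for example taking the square root of each weight) are a common adjustment, but that is a tuning choice rather than something the heuristic prescribes.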