For convenience, I tried to use the compute_class_weight function from sklearn.utils.class_weight.
However, I get the error "classes should include all valid labels that can be in y", even though I am 100% sure that I am passing all the class labels that occur in y.
print(len(np.unique(y_train)), 'classes in training set')
>>> 86 classes in training set
So that works without problems. Taking the len:
print(len(y_train), 'train samples')
>>> 6914 train samples
Just to make sure, the shape:
y_train.shape
>>> (6914, 1)
So yes, I have a column vector of training labels. I know that four or five classes totally dominate the rest, so I wanted to add class weights.
from sklearn.utils.class_weight import compute_class_weight
class_weights = compute_class_weight('balanced', classes = np.unique(y_train), y = y_train)
>>> ValueError: classes should include all valid labels that can be in y
And here I am. What is wrong here?
Solved it. Thanks anyway.
The shape (many, one) was the problem; after flattening y_train with np.ravel() it was no longer a problem.
from sklearn.utils.class_weight import compute_class_weight
class_weights = compute_class_weight('balanced', classes = np.unique(y_train), y = np.ravel(y_train))
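
For anyone hitting the same error, here is a minimal self-contained sketch of the fix on made-up data. The toy labels and the names y_flat and class_weight_dict are mine, not from the question. It flattens the (n, 1) label array to shape (n,) before calling compute_class_weight, then pairs each class with its weight:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical toy labels stored as a column vector, shape (6, 1),
# standing in for the real y_train of shape (6914, 1).
y_train = np.array([[0], [0], [0], [0], [1], [2]])

# Flatten to a 1-D array of shape (6,), as in the fix above.
y_flat = np.ravel(y_train)

classes = np.unique(y_flat)
class_weights = compute_class_weight('balanced', classes=classes, y=y_flat)

# Pair each class label with its weight.
class_weight_dict = dict(zip(classes.tolist(), class_weights.tolist()))
print(class_weight_dict)
```

With 'balanced', each weight is n_samples / (n_classes * count of that class), so the rare classes end up with the larger weights. A dict of this form is what, for example, the class_weight argument of many scikit-learn estimators accepts.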