I'm having trouble fitting my classifier using binarized labels.
clf_linear = GridSearchCV(SVC(kernel='linear', class_weight='balanced'),
param_grid, cv=5)
clf_linear = clf_linear.fit(X_train_pca, y_train)
y_train was binarized by the following method:
y_train = label_binarize(y_train, classes=[1, 2, 3])
I got the following error:
File "C:\Python\lib\site-packages\sklearn\utils\validation.py", line 788, in column_or_1d
    raise ValueError("bad input shape {0}".format(shape))
ValueError: bad input shape (545, 3)
The input label shape is (682, 3), not (545, 3).
My professor told me to use binarized labels in GridSearchCV, but after reading the scikit-learn docs I don't think I can do this.
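It doesn't matter whether it's (682, 3) or (545, 3); 545 is just the size of one training fold that the 5-fold cross-validation takes from your 682 samples. The real question is why the target has 3 columns. Your y (targets) should be a 1-d array for SVC. You don't need the label_binarize operation. Keep y_train as it is.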
Doing this:
y_train = label_binarize(y_train, classes=[1, 2, 3])
will convert y_train to a label-indicator matrix. That format is used for multi-label classification problems (where a sample can belong to more than one class at a time). It's not the target format SVC expects for multi-class problems.
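To see the shape change for yourself, here is a minimal sketch. The five labels in y are made up for illustration and are not from your data.

```python
import numpy as np
from sklearn.preprocessing import label_binarize

# Hypothetical 1-d multi-class labels, one class per sample.
y = np.array([1, 2, 3, 2, 1])

# label_binarize expands them into one indicator column per class.
Y = label_binarize(y, classes=[1, 2, 3])

print(y.shape)  # (5,)   -> what SVC expects
print(Y.shape)  # (5, 3) -> the 2-d shape that triggers "bad input shape"
```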
Keep y_train as it is, so that it stays a one-dimensional array, and SVC will handle the rest.
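Something like the following should fit without the error. This is only a sketch: the data is synthetic (make_classification stands in for your X_train_pca), and the param_grid values are my assumption, since your grid isn't shown in the question.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for X_train_pca / y_train (assumption, not your data).
X_train, y_train = make_classification(
    n_samples=120, n_features=10, n_informative=5,
    n_classes=3, random_state=0,
)
y_train = y_train + 1  # shift labels from 0..2 to 1..3, as in the question

# Hypothetical grid; the original param_grid is not shown.
param_grid = {"C": [0.1, 1, 10]}

clf_linear = GridSearchCV(
    SVC(kernel="linear", class_weight="balanced"),
    param_grid, cv=5,
)
# y_train is left as a plain 1-d array of class labels, no label_binarize.
clf_linear = clf_linear.fit(X_train, y_train)

print(y_train.shape)  # (120,)
print(clf_linear.best_params_)
```

SVC handles the three classes internally, so the fitted search object predicts the original labels 1, 2 and 3 directly.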