How to handle the error ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous-multioutput' instead?
I tried a few things, such as from sklearn.utils.multiclass import type_of_target or passing x[0], y[0], but without success.
X.shape, Y.shape
((336, 10), (336, 5))
for train, test in kfold.split(X, Y):
    model = Sequential()
    model.add(Dense(10, input_dim=20,
                    kernel_regularizer=l2(0.001),
                    kernel_initializer=VarianceScaling(),
                    activation='sigmoid'))
    model.add(Dense(5,
                    kernel_regularizer=l2(0.01),
                    kernel_initializer=VarianceScaling(),
                    activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam',
                  metrics=['acc'])
    model.fit(X[train], Y[train], epochs=50, batch_size=25, verbose=0,
              validation_data=(X[test], Y[test]))
    scores = model.evaluate(X[test], Y[test], verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[2], scores[2]*100))
    cvscores.append(scores[2] * 100)
---------------------------------------------------------------------------
ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous-multioutput' instead.
StratifiedKFold is not meant to be used with multilabel targets, as already pointed out here. It needs a 1D array of class labels to determine how to split the indices.
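You can see this by asking scikit-learn what kind of target it infers. The snippet below is a minimal sketch with random data shaped like yours (the data itself is made up):

import numpy as np
from sklearn.utils.multiclass import type_of_target

X = np.random.rand(336, 10)
Y = np.random.rand(336, 5)   # continuous probabilities per label, as in the question

print(type_of_target(Y))                 # 'continuous-multioutput' -> rejected by StratifiedKFold
print(type_of_target(Y.argmax(axis=1)))  # 'multiclass' (or 'binary' if only two labels occur) -> accepted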
I suppose you want to split your target based on the label with the highest probability. One way to achieve this is to create a 1D array indicating, for each sample, the target with the highest probability and pass that to StratifiedKFold instead of the multilabel target.
Let's say you have your sample data in a pandas DataFrame y and it looks like this:
       0      1    2    3    4
0  0.966  0.000  0.0  0.2  0.0
1  0.966  0.000  0.0  0.0  0.2
2  0.000  0.966  0.5  0.0  0.0
3  0.000  0.966  0.0  0.0  0.0
4  0.966  0.000  0.0  0.0  0.0
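If your targets currently live in a numpy array Y, as in the question, wrapping them in a DataFrame is a one-liner (a small sketch, assuming Y is the (336, 5) array of per-label probabilities shown above):

import pandas as pd

y = pd.DataFrame(Y)   # columns 0..4, one row per sample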
Then, create a new object with idxmax to find the target with the highest probability:
y_max = y.idxmax(axis=1)
This gives you an output like this:
0    0
1    0
2    1
3    1
4    0
dtype: int64
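Before splitting, it is worth checking that every class in y_max occurs at least as often as the number of folds, since StratifiedKFold cannot stratify classes with fewer members than splits:

print(y_max.value_counts())   # each class should appear at least n_splits times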
Now you can pass this array to StratifiedKFold and obtain the indices you need:
for train, test in kfold.split(X, y_max):
    ...
    model.fit(X[train], Y[train], epochs=50, batch_size=25, verbose=0,
              validation_data=(X[test], Y[test]))
    scores = model.evaluate(X[test], Y[test], verbose=0)
    # with metrics=['acc'], metrics_names is ['loss', 'acc'], so the accuracy sits at index 1
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)
This way, you can obtain the indices from a 1D array and still use the original multilabel data for training and testing. If your data happens to be in a numpy array, the same can be achieved with numpy's argmax function.
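For example, staying entirely in numpy (a minimal sketch, assuming Y is the (336, 5) target array from the question):

import numpy as np

# Row-wise index of the highest probability; the numpy counterpart of DataFrame.idxmax(axis=1)
y_max = Y.argmax(axis=1)   # 1D integer array of shape (336,)

This y_max can then be passed to kfold.split(X, y_max) exactly as above.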