python keras scikit-learn neural-network sequential

Error: Supported target types are: ('binary', 'multiclass')

How do I handle the error ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous-multioutput' instead?

I tried using type_of_target (from sklearn.utils.multiclass import type_of_target) and indexing with x[0], y[0], but without success.

Visualization of X:

Visualization of Y:

X.shape, Y.shape

((336, 10), (336, 5))

Deep learning model:

for train, test in kfold.split(X, Y):

    model = Sequential()
    model.add(Dense(10, input_dim=10,  # must match X.shape[1], which is 10 here
                kernel_regularizer=l2(0.001),
                kernel_initializer=VarianceScaling(), 
                activation='sigmoid'))
    model.add(Dense(5, 
                kernel_regularizer=l2(0.01),
                kernel_initializer=VarianceScaling(),                 
                activation='sigmoid'))
    
    model.compile(loss='binary_crossentropy', optimizer='adam', 
              metrics=['acc'])
    
    model.fit(X[train], Y[train], epochs=50, batch_size=25, verbose = 0,
              validation_data=(X[test], Y[test]))

    scores = model.evaluate(X[test], Y[test], verbose=0)
    # With metrics=['acc'], evaluate returns [loss, acc], so accuracy is at index 1
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)

---------------------------------------------------------------------------
ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous-multioutput' instead.

Solution

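StratifiedKFold is not meant to be used with multilabel or multi-output targets, as has already been pointed out elsewhere. It needs a 1D array of class labels to determine how to split the indices, and a 2D array of floats like your Y is classified as 'continuous-multioutput', which it rejects.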
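To see why the split is rejected, it can help to ask scikit-learn how it classifies the target. The following is a minimal sketch, assuming a small made-up array Y_demo of per-class probabilities (the name and values are illustrative, not the asker's data):

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical target: one row per sample, one probability per class
Y_demo = np.array([
    [0.966, 0.000, 0.0, 0.2, 0.0],
    [0.966, 0.000, 0.0, 0.0, 0.2],
    [0.000, 0.966, 0.5, 0.0, 0.0],
    [0.000, 0.966, 0.0, 0.0, 0.0],
    [0.966, 0.000, 0.0, 0.0, 0.0],
])

# A 2D array of non-integer floats is not a classification target
full_type = type_of_target(Y_demo)
print(full_type)

# A 1D array of integer class indices is 'binary' or 'multiclass',
# depending on how many distinct classes occur
label_type = type_of_target(Y_demo.argmax(axis=1))
print(label_type)
```

The first call should report the same type named in the error message, while the second reports one of the two types StratifiedKFold accepts.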

I suppose you want to stratify the split by the label with the highest probability. One way to achieve this is to create a 1D array that holds, for each sample, the label with the highest probability, and to pass that array to StratifiedKFold instead of the 2D target.

Let's say you have your sample data in a pandas DataFrame y and it looks like this:

       0      1    2    3    4
0  0.966  0.000  0.0  0.2  0.0
1  0.966  0.000  0.0  0.0  0.2
2  0.000  0.966  0.5  0.0  0.0
3  0.000  0.966  0.0  0.0  0.0
4  0.966  0.000  0.0  0.0  0.0

Then use idxmax to create a Series that holds, for each row, the column label with the highest probability:

y_max = y.idxmax(axis=1)

This gives you an output like this:

0    0
1    0
2    1
3    1
4    0
dtype: int64

Now you can pass this Series to StratifiedKFold and obtain the indices you need:

for train, test in kfold.split(X, y_max):
    ...

    model.fit(X[train], Y[train], epochs=50, batch_size=25, verbose = 0,
              validation_data=(X[test], Y[test]))

    scores = model.evaluate(X[test], Y[test], verbose=0)
    # With metrics=['acc'], evaluate returns [loss, acc], so accuracy is at index 1
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)

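This way, the split indices come from a 1D array while training and testing still use the original data. If your data is in a NumPy array rather than a DataFrame, np.argmax(Y, axis=1) gives the equivalent 1D array of column positions. Note that X[train] and Y[train] select rows only for NumPy arrays; for a DataFrame, use Y.iloc[train] instead.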
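The NumPy variant could look roughly like the sketch below. It leaves out the Keras model and uses random stand-in arrays with the shapes from the question; the names X_demo and Y_demo, the number of splits, and the shuffle settings are all assumptions for illustration:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-ins with the shapes from the question (not the real data)
rng = np.random.default_rng(0)
X_demo = rng.random((336, 10))
Y_demo = rng.random((336, 5))

# 1D array with the position of the largest value in each row
y_max = np.argmax(Y_demo, axis=1)

# Split settings are assumptions; use whatever your kfold already uses
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_sizes = []
for train, test in kfold.split(X_demo, y_max):
    # Stratify on y_max, but slice the original 2D target for the model
    X_train, Y_train = X_demo[train], Y_demo[train]
    X_test, Y_test = X_demo[test], Y_demo[test]
    fold_sizes.append((len(train), len(test)))

print(fold_sizes)
```

Inside the loop, Y_train and Y_test keep all five columns, so the model can be built, fitted and evaluated exactly as before.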