python keras scikit-learn neural-network sequential

Error: Supported target types are: ('binary', 'multiclass')


How do I handle the error ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous-multioutput' instead?

I tried working with from sklearn.utils.multiclass import type_of_target, and with x[0], y[0], but without success.

Visualization of X: (image not included)

Visualization of Y: (image not included)

X.shape, Y.shape

((336, 10), (336, 5))

Deep learning model:

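from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2
from keras.initializers import VarianceScaling

# kfold and cvscores are defined earlier (not shown).
for train, test in kfold.split(X, Y):

    # Build a fresh model for every fold.
    model = Sequential()
    model.add(Dense(10, input_dim=X.shape[1],  # must match the number of feature columns in X
                kernel_regularizer=l2(0.001),
                kernel_initializer=VarianceScaling(),
                activation='sigmoid'))
    model.add(Dense(5,  # one output unit per column of Y
                kernel_regularizer=l2(0.01),
                kernel_initializer=VarianceScaling(),
                activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['acc'])

    model.fit(X[train], Y[train], epochs=50, batch_size=25, verbose=0,
              validation_data=(X[test], Y[test]))

    # evaluate() returns [loss, acc], so the accuracy sits at index 1.
    scores = model.evaluate(X[test], Y[test], verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)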
---------------------------------------------------------------------------
ValueError: Supported target types are: ('binary', 'multiclass'). Got 'continuous-multioutput' instead.

Solution

  • StratifiedKFold is not meant to be used for multilabel or multi-output targets, as has been pointed out before. It needs a 1D array of class labels to determine how to stratify the split indices.

    I suppose you want to split your target based on the label with the highest probability. One way to achieve this goal would be to create a 1D-array indicating the target with the highest probability and pass this one to StratifiedKFold instead of the multilabel target.

    Let's say you have your sample data in a pandas DataFrame y and it looks like this:

           0      1    2    3    4
    0  0.966  0.000  0.0  0.2  0.0
    1  0.966  0.000  0.0  0.0  0.2
    2  0.000  0.966  0.5  0.0  0.0
    3  0.000  0.966  0.0  0.0  0.0
    4  0.966  0.000  0.0  0.0  0.0
    

    Then, create a new object with idxmax to find the target with highest probability:

    y_max = y.idxmax(axis=1)
    

    This gives you an output like this:

    0    0
    1    0
    2    1
    3    1
    4    0
    dtype: int64
    

    Now you can pass this array to StratifiedKFold and obtain the indices you need:

    for train, test in kfold.split(X, y_max):
        ...
    
        model.fit(X[train], Y[train], epochs=50, batch_size=25, verbose = 0,
                  validation_data=(X[test], Y[test]))
    
        scores = model.evaluate(X[test], Y[test], verbose=0)
        print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
        cvscores.append(scores[1] * 100)
    

    This way, you can obtain the indices from a 1D-array and still use the original data for training and testing. If your data happens to be in a numpy array, the same can be achieved with numpy's argmax function.
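
For the NumPy case, the approach above could look roughly like the sketch below. It is only an illustration under assumptions: the toy X and Y stand in for the real data (Y is the five-row sample table repeated four times), and n_splits, shuffle and random_state are arbitrary choices, not values from the question. It first reproduces the error with the 2D target, then stratifies on the argmax labels while still indexing the original arrays.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.utils.multiclass import type_of_target

# Toy stand-in for the real data: the 5-row sample table repeated 4 times.
Y_block = np.array([
    [0.966, 0.000, 0.0, 0.2, 0.0],
    [0.966, 0.000, 0.0, 0.0, 0.2],
    [0.000, 0.966, 0.5, 0.0, 0.0],
    [0.000, 0.966, 0.0, 0.0, 0.0],
    [0.966, 0.000, 0.0, 0.0, 0.0],
])
Y = np.tile(Y_block, (4, 1))                                   # shape (20, 5)
X = np.arange(Y.shape[0] * 10, dtype=float).reshape(-1, 10)    # shape (20, 10)

# Illustrative settings only; use whatever your own kfold object uses.
kfold = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)

# The 2D float target is what StratifiedKFold rejects.
print(type_of_target(Y))  # continuous-multioutput

# Passing Y directly reproduces the ValueError from the question.
try:
    next(kfold.split(X, Y))
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Reduce each row to the index of its largest value: a 1D label array.
y_max = Y.argmax(axis=1)
print(type_of_target(y_max))  # binary (only labels 0 and 1 occur in this toy data)

# Stratify on y_max, but keep slicing the original X and Y.
fold_counts = []
for train, test in kfold.split(X, y_max):
    X_train, Y_train = X[train], Y[train]
    X_test, Y_test = X[test], Y[test]
    # Record how many samples of each dominant label land in the test fold.
    fold_counts.append(np.bincount(y_max[test]).tolist())
print(fold_counts)
```

Note that stratifying on the argmax only balances the dominant label across folds; the remaining columns of Y are not taken into account when the indices are chosen.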