python, keras, scikit-learn, classification, cross-validation

Keras does not accept the target label due to its shape?


I am trying to get Keras working on a classification problem with five categorical target labels (1, 2, 3, 4, 5), but I cannot get it to work with StratifiedKFold. X and y are NumPy arrays with shapes (500, 20) and (500,), respectively.

The error message is "ValueError: Error when checking target: expected dense_35 to have shape (1,) but got array with shape (5,)", which makes me think the problem lies in the format of the target variable. Also, the number in "dense_35" changes every time I run the code.

random_state = 123
n_splits = 10
cv = StratifiedKFold(n_splits=n_splits,
                     random_state=random_state, shuffle=False)

def baseline_model():
    nn_model = Sequential()
    nn_model.add(Dense(units=50, input_dim=X.shape[1], init='normal',
                       activation= 'relu' ))
    nn_model.add(Dense(30, init='normal', activation='relu'))
    nn_model.add(Dense(10, init='normal', activation='relu'))
    nn_model.add(Dense(1, init='normal', activation='softmax'))

    nn_model.compile(optimizer='adam', loss='categorical_crossentropy',
                     metrics = ['accuracy'])
    return nn_model

for train, test in cv.split(X, y):   
    X_train, X_test = X[train], X[test]
    y_train, y_test = y[train], y[test]

    np_utils.to_categorical(y_train)
    np_utils.to_categorical(y_test)

    estimator = KerasClassifier(build_fn=baseline_model,
                                epochs=200, batch_size=5,
                                verbose=0)

    estimator.fit(X_train, y_train)
    y_pred = estimator.predict(X_test)

The NumPy array (y) that I am trying to split:
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5
 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5]

Solution

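    # legacy standalone Keras 2.x import paths, matching the API used below
    from keras.layers import Dense
    from keras.models import Sequential
    from keras.utils import np_utils
    from keras.wrappers.scikit_learn import KerasClassifier
    from sklearn.model_selection import StratifiedKFold

    random_state = 123
    n_splits = 10
    # random_state only takes effect when shuffle=True; newer scikit-learn
    # versions raise an error if it is set together with shuffle=False
    cv = StratifiedKFold(n_splits=n_splits, random_state=random_state, shuffle=True)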
    def baseline_model():
        nn_model = Sequential(name='model_name')
        nn_model.add(Dense(units=50, input_dim=X.shape[1], init='normal',
                           activation= 'relu', name='dense1'))
        nn_model.add(Dense(30, init='normal', activation='relu', name='dense2'))
        nn_model.add(Dense(10, init='normal', activation='relu', name='dense3'))
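        # changed: one output unit per class (5 instead of 1), so the softmax
        # output matches the one-hot targets of shape (5,)
        nn_model.add(Dense(5, init='normal', activation='softmax', name='dense4'))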
    
        nn_model.compile(optimizer='adam', loss='categorical_crossentropy',
                         metrics = ['accuracy'])
        return nn_model
    
    for train, test in cv.split(X, y):   
        X_train, X_test = X[train], X[test]
        y_train, y_test = y[train], y[test]
    
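        # The original error came from the output layer: it had a single unit,
        # but KerasClassifier one-hot encodes the five classes when the loss is
        # categorical_crossentropy, so each target has shape (5,).
        # The two calls below return new arrays that are discarded here, so
        # they have no effect on y_train / y_test and can be removed.
        np_utils.to_categorical(y_train)
        np_utils.to_categorical(y_test)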
    
        estimator = KerasClassifier(build_fn=baseline_model,
                                    epochs=200, batch_size=5,
                                    verbose=0)
        estimator.fit(X_train, y_train)
        y_pred = estimator.predict(X_test)
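
To make the point about the discarded to_categorical results concrete, here is a minimal sketch in plain NumPy. The helper one_hot is a hypothetical stand-in written for this illustration (it is not the Keras function), assuming it behaves like to_categorical in two respects: it returns a new array, and by default it creates max(label) + 1 columns.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Hypothetical NumPy stand-in for to_categorical: returns a NEW array."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = labels.max() + 1  # columns 0..max(label)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Same layout as the y in the question: 100 samples for each of the labels 1..5
y = np.repeat([1, 2, 3, 4, 5], 100)

one_hot(y)                           # result discarded, y itself is unchanged
shape_unassigned = y.shape           # still (500,)

shape_raw = one_hot(y).shape         # (500, 6): column 0 is never used

# Map the labels 1..5 to indices 0..4 first to get exactly five columns
classes = np.unique(y)
y_idx = np.searchsorted(classes, y)
y_onehot = one_hot(y_idx, num_classes=len(classes))
```

So if you do want to encode the targets yourself, assign the result and shift the labels to start at 0 first; otherwise labels 1..5 would need six output units rather than five.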
    
    • Passing a name argument to each layer gives it a fixed name, so an error message refers to the same layer on every run instead of an auto-numbered one such as dense_35.
    • model.summary() is also useful: it prints the output shape of each layer, so you can check that the last layer matches the shape of your targets.
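
As a side note on why the last layer needs one unit per class: the sketch below uses a hand-written NumPy softmax (not Keras itself) on random numbers that merely stand in for the pre-activation outputs of a Dense(1) and a Dense(5) layer. The final step assumes the classes are the sorted labels 1..5, as in the question.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(123)

# Random values standing in for layer outputs before the activation
one_unit = softmax(rng.normal(size=(4, 1)))    # like Dense(1, activation='softmax')
five_units = softmax(rng.normal(size=(4, 5)))  # like Dense(5, activation='softmax')

# With a single unit every row normalizes to 1.0, so the output carries no
# information. With five units each row is a distribution over the classes.
predicted = five_units.argmax(axis=1) + 1      # indices 0..4 back to labels 1..5
```

A softmax over a single value always returns 1.0, so a one-unit softmax layer could not distinguish the classes even if the shapes happened to match.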