Tags: python, tensorflow, keras

Dimensionality error when fitting model in Keras


I'm trying to build a model using Keras (TensorFlow backend):

def build_model(args):
    # Define the input nodes
    text1 = build_input_node('text1', args.batch_size, args.time_steps)
    text2 = build_input_node('text2', args.batch_size, args.time_steps)

    # Create the shared LSTM node
    shared_lstm = LSTM(INPUT_SIZE, stateful=args.stateful)

    # Run inputs through shared layer
    encoded1 = shared_lstm(text1)
    encoded2 = shared_lstm(text2)

    # Concatenate outputs to form a tensor of shape (2*batch_size, INPUT_SIZE)
    concatenated = concatenate([encoded1, encoded2], axis=0)

    # Input shape: (2*batch_size, INPUT_SIZE)
    # Output shape: (2*batch_size, batch_size)
    dense1 = Dense(args.batch_size,
                   input_shape=(2 * args.batch_size, INPUT_SIZE),
                   activation='sigmoid')(concatenated)

    # Input shape: (2*batch_size, batch_size)
    # Output shape: (2*batch_size, 1)
    output_shape = (2 * args.batch_size, 1)
    output = Dense(1,
                   input_shape=(2 * args.batch_size, args.batch_size),
                   activation='sigmoid')(dense1)

    model = Model(inputs=[text1, text2], outputs=output)
    optimizer = build_optimizer(name=args.optimizer, lr=args.learning_rate)
    model.compile(loss=args.loss,
                  optimizer=optimizer,
                  metrics=['accuracy'])
    return model, output_shape
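
build_input_node is a helper that isn't shown in the question. Judging by the model summary further down, where each input layer has shape (8, 15, 384), it presumably wraps a Keras Input with a fully specified batch shape; a minimal sketch, assuming INPUT_SIZE = 384:

from keras.models import Model
from keras.layers import Input, LSTM, Dense, concatenate

INPUT_SIZE = 384  # assumed from the (8, 15, 384) input shape in the summary

def build_input_node(name, batch_size, time_steps):
    # The summary shows a fixed batch size, so the input is declared with
    # batch_shape=(batch_size, time_steps, features) rather than just shape
    return Input(batch_shape=(batch_size, time_steps, INPUT_SIZE), name=name)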

The data that is fed into the model is reshaped to match the output_shape variable:

def build_datasets(input, time_steps, output_shape):
    T1 = []
    T2 = []
    Y = []
    for sentence1, sentence2, score in input:
        T1.append([t.vector for t in nlp(sentence1)])
        T2.append([t.vector for t in nlp(sentence2)])
        Y.append(np.full(output_shape, score))

    T1 = pad_and_reshape(T1, time_steps)
    T2 = pad_and_reshape(T2, time_steps)

    X = [T1, T2]
    Y = np.asarray(Y)
    # fit the scores between 0 and 1
    Y = expit(Y)
    return X, Y
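
pad_and_reshape is likewise not shown; given the t.vector calls above (spaCy-style 384-dimensional token vectors), it presumably zero-pads or truncates each sentence to time_steps vectors so the set stacks into (samples, time_steps, 384). A rough sketch under that assumption:

import numpy as np

def pad_and_reshape(sequences, time_steps, vector_size=384):
    # Zero-pad (or truncate) each list of token vectors to `time_steps` rows
    # so the whole set stacks into shape (samples, time_steps, vector_size)
    out = np.zeros((len(sequences), time_steps, vector_size), dtype=np.float32)
    for i, seq in enumerate(sequences):
        arr = np.asarray(seq, dtype=np.float32)[:time_steps]
        if arr.size:
            out[i, :arr.shape[0], :] = arr
    return out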

But when I call model.fit(X, Y, epochs=100, batch_size=8), it throws the following error:

ValueError: Error when checking target: expected dense_34 to have 2 dimensions, but got array with shape (1468, 16, 1)

where 1468 is the number of samples and 16 is 2*batch_size.

What am I doing wrong? How can I get the proper shape for the output node?

Model summary:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
text1 (InputLayer)              (8, 15, 384)         0
__________________________________________________________________________________________________
text2 (InputLayer)              (8, 15, 384)         0
__________________________________________________________________________________________________
lstm_1 (LSTM)                   (8, 384)             1181184     text1[0][0]
                                                                 text2[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (16, 384)            0           lstm_1[0][0]
                                                                 lstm_1[1][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (16, 8)              3080        concatenate_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense)                 (16, 1)              9           dense_1[0][0]
__________________________________________________________________________________________________

Total params: 1,184,273
Trainable params: 1,184,273
Non-trainable params: 0
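
For reference, the final Dense layer (dense_2) produces a 2-D output of shape (16, 1) per batch, while Y is built as one (16, 1) block per sample. The two sides can be compared directly, assuming model, _ = build_model(args) and X, Y = build_datasets(...) from above:

import numpy as np

print(np.asarray(Y).shape)   # (1468, 16, 1) -> one (16, 1) block per sample
print(model.output_shape)    # (16, 1)       -> a 2-D output, so a 2-D target is expected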

Solution

  • After debugging a bit through the Keras code, I found that Keras adjusts the dimensionality of Y with this line:

    y = _standardize_input_data(y, self._feed_output_names,
                                output_shapes,
                                check_batch_axis=False,
                                exception_prefix='target')
    

    which in turn calls

    data = [np.expand_dims(x, 1) if x is not None and x.ndim == 1 else x for x in data]
    

    and since my Y had the shape (1468, 16, 1), it threw an error later when validating.

    The fix was to replace Y.append(np.full(output_shape, score)) with Y.append(score), as sketched below.
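
    With that change applied, a corrected sketch of build_datasets (nlp and pad_and_reshape are the question's helpers; expit is assumed to be scipy.special.expit, and the output_shape parameter is no longer needed):

    import numpy as np
    from scipy.special import expit

    def build_datasets(input, time_steps):
        # nlp and pad_and_reshape come from the question's code and are assumed here
        T1 = []
        T2 = []
        Y = []
        for sentence1, sentence2, score in input:
            T1.append([t.vector for t in nlp(sentence1)])
            T2.append([t.vector for t in nlp(sentence2)])
            Y.append(score)  # one scalar label per sample pair

        T1 = pad_and_reshape(T1, time_steps)
        T2 = pad_and_reshape(T2, time_steps)

        X = [T1, T2]
        # fit the scores between 0 and 1; Y now has shape (num_samples,)
        Y = expit(np.asarray(Y))
        return X, Y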