Tags: python, tensorflow, keras, neural-network, lstm

Data cardinality is ambiguous in shape of input to LSTM layer


Suppose I have these 7 time-series samples:

1,2,3,4,5,6,7

I know each sample is related to the two samples before it. That is, if the two earlier samples are 1,2 then the next one must be 3, for 2,3 the next one is 4, and so on.

Now I want to train an RNN with an LSTM layer on the samples above. What I did is:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

X = np.array([[[1]],[[2]],[[3]],[[4]],[[5]],[[6]],[[7]]])
Y = np.array([[3],[4],[5],[6],[7]])

model = keras.Sequential([
    layers.LSTM(16, input_shape=(2, 1)),
    layers.Dense(1, activation="softmax")
])

model.compile(optimizer="rmsprop",
              loss="mse",
              metrics=["accuracy"])

model.fit(X, Y, epochs=1, batch_size=1)

But I encountered this error:

ValueError: Data cardinality is ambiguous:
  x sizes: 7
  y sizes: 5
Make sure all arrays contain the same number of samples.

How should I change the shapes of X and Y to solve the problem?


Solution

  • There are a couple of issues. First, because this is supervised training, the number of samples in X must match the number of targets in Y; your X has 7 samples while Y has 5, which is what the error reports. Next, since you want a model that takes 1 and 2 and predicts 3, you need to prepare the data so that each sample holds two consecutive values as X and the following value as Y; for the first instance, X[0]: [[1], [2]] and Y[0]: [3]. Lastly, the softmax activation in your last layer is incorrect here: softmax over a single unit always outputs 1, so for this regression task the output should use a linear activation (the default for Dense). Below is the full working code.
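As a quick check of the first and last points, here is a minimal NumPy-only sketch. The names X_bad, Y_bad and the softmax helper are illustrative and not part of the original code; the helper is a plain reimplementation of softmax, not the Keras activation itself.

```python
import numpy as np

# Arrays as written in the question: 7 inputs but only 5 targets.
X_bad = np.array([[[1]], [[2]], [[3]], [[4]], [[5]], [[6]], [[7]]])
Y_bad = np.array([[3], [4], [5], [6], [7]])
print(X_bad.shape)  # (7, 1, 1)
print(Y_bad.shape)  # (5, 1)


def softmax(z):
    """Numerically stable softmax over the last axis (illustrative helper)."""
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


# With a single output unit, softmax normalises each value by itself,
# so every prediction is 1 regardless of the input.
single_unit = softmax(np.array([[-3.0], [0.0], [42.0]]))
print(single_unit.ravel())  # [1. 1. 1.]
```

The first axis of X and Y is the sample axis, and Keras compares exactly those two numbers (7 and 5) when it raises the cardinality error.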

    Data Generator

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    data = np.array([1, 2, 3, 4, 5, 6, 7])
    sequences_length = 2
    
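    def dataloader(data, sequences_length):
        # Slide a window over the series: each run of sequences_length
        # values is one input, and the value right after it is the target.
        X, Y = [], []
        for i in range(len(data) - sequences_length):
            X.append(data[i:i+sequences_length])
            Y.append(data[i+sequences_length])
        return np.array(X), np.array(Y)
    
    X, Y = dataloader(data, sequences_length)
    # LSTM layers expect input of shape (samples, timesteps, features)
    X = np.reshape(X, (X.shape[0], sequences_length, 1))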
    
    # check: print each input window next to its target
    for i in range(X.shape[0]):
        print(X[i].reshape(-1), Y[i])
    # Output:
    # [1 2] 3
    # [2 3] 4
    # [3 4] 5
    # [4 5] 6
    # [5 6] 7
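The same input/target pairs can also be built without an explicit loop. The sketch below is an alternative, not the method used in this answer; it assumes NumPy 1.20 or newer, where numpy.lib.stride_tricks.sliding_window_view is available. The names windows, X_alt and Y_alt are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

data = np.array([1, 2, 3, 4, 5, 6, 7])
sequences_length = 2

# Each row holds sequences_length inputs followed by one target value.
windows = sliding_window_view(data, sequences_length + 1)  # shape (5, 3)

# Inputs: the first sequences_length columns, plus a trailing feature axis.
X_alt = windows[:, :sequences_length, np.newaxis]  # shape (5, 2, 1)
# Targets: the last column of each window.
Y_alt = windows[:, -1]  # shape (5,)

for x, y in zip(X_alt, Y_alt):
    print(x.reshape(-1), y)
```

sliding_window_view returns a read-only view of the original array, so call .copy() on the result if the windows need to be modified in place.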
    

    Model

    model = keras.Sequential([
        layers.LSTM(64, input_shape=(sequences_length, 1)),
        layers.Dense(1)
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, Y, epochs=1000, batch_size=1)
    

    Prediction

    inference_data = np.array([[8, 9]]).reshape(
        1, sequences_length, 1
    )
    model.predict(inference_data)
    # Output:
    # 1/1 [==============================] - 0s 25ms/step
    # array([[9.420095]], dtype=float32)
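Keras LSTM layers take input shaped (batch, timesteps, features), which is why the window [8, 9] is reshaped to (1, sequences_length, 1) before calling predict. The sketch below wraps that reshape in a small helper so one or several windows can be prepared the same way. to_lstm_input is a hypothetical name introduced here for illustration; it is not part of Keras or of the original answer.

```python
import numpy as np


def to_lstm_input(windows, sequences_length):
    """Hypothetical helper: shape raw windows as (batch, timesteps, 1)."""
    arr = np.asarray(windows, dtype="float32")
    if arr.ndim == 1:
        # A single window such as [8, 9] becomes a batch of one.
        arr = arr[np.newaxis, :]
    if arr.ndim != 2 or arr.shape[1] != sequences_length:
        raise ValueError(
            f"expected windows of length {sequences_length}, got shape {arr.shape}"
        )
    # Add the trailing feature axis (one feature per timestep).
    return arr[..., np.newaxis]


single = to_lstm_input([8, 9], sequences_length=2)
batch = to_lstm_input([[1, 2], [4, 5], [8, 9]], sequences_length=2)
print(single.shape)  # (1, 2, 1)
print(batch.shape)   # (3, 2, 1)
```

Either array could then be passed to model.predict, which returns one prediction per row of the batch.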