Tags: python, tensorflow, keras, lstm

How do I correctly use an LSTM model to make predictions?


I have the following setup:

x,y=load_data_xy("file",[<input headers>],[<target headers>])
b_size=1024
history_length=100
data_gen = TimeseriesGenerator(x.to_numpy(), y.to_numpy(), shuffle=True,
                               length=history_length,
                               batch_size=b_size)

Later, I create and train the LSTM model, and then I want to evaluate the original data with the model. Here's what I'm doing:

data_gen = TimeseriesGenerator(x.to_numpy(), y.to_numpy(),
                               length=history_length,
                               batch_size=b_size)
prediction_result=[]
for xg,yg in data_gen:
    if len(prediction_result)==0:
        ye=model.predict(xg,batch_size=b_size,verbose=0)
        prediction_result+=ye.tolist()
    else:
        ye=model.predict(xg[-1],batch_size=b_size,verbose=0)
        prediction_result+=ye.tolist()
        
prediction_result=[item for sublist in prediction_result for item in sublist]
print(x.shape)
print(len(prediction_result))

The output of this code is:

(41020, 18)
40960

The prediction is 60 items short, and I don't know where that number comes from. How do I get outputs that correspond one-to-one with the inputs?

UPDATE: Here is how I defined the model:

import tensorflow as tf
from tensorflow import keras
INNER_LAYER_SIZE=10
n_input=100
dropout_rate=1./5


model = keras.models.Sequential([
    keras.layers.LSTM(
        x.shape[1],
        return_sequences=True,
        batch_input_shape=(b_size, n_input, x.shape[1]),
        kernel_initializer=tf.keras.initializers.RandomUniform(),
        dropout=1.*dropout_rate/x.shape[1],
    )
])

for i in range(2):
    model.add(tf.keras.layers.BatchNormalization())
    model.add(keras.layers.LSTM(INNER_LAYER_SIZE,return_sequences=True, kernel_initializer=tf.keras.initializers.RandomUniform(),dropout=1.*dropout_rate/INNER_LAYER_SIZE))

model.add(tf.keras.layers.BatchNormalization())
model.add(keras.layers.LSTM(INNER_LAYER_SIZE, kernel_initializer=tf.keras.initializers.RandomUniform(),dropout=1.*dropout_rate/INNER_LAYER_SIZE))
model.add(keras.layers.Dense(INNER_LAYER_SIZE, kernel_initializer=tf.keras.initializers.RandomUniform()))
model.add(tf.keras.layers.BatchNormalization())
model.add(keras.layers.LeakyReLU())
model.add(tf.keras.layers.Dropout(1.0*dropout_rate/INNER_LAYER_SIZE))
model.add(keras.layers.Dense(y.shape[1], kernel_initializer=tf.keras.initializers.RandomUniform()))
model.add(keras.layers.LeakyReLU())

model.compile(loss="mse", metrics=["mean_absolute_error"], optimizer=tf.keras.optimizers.SGD(
    learning_rate=0.1, momentum=0.25, nesterov=True, decay=.001#/x.shape[0]
))#

last_loss=1

model.summary()

Solution

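  • It seems that the TimeseriesGenerator yields only full batches here (each with 1024 items) and no shorter final batch. Since 41020 % 1024 is 60, the 40 batches hold 40960 items in total, which is 60 fewer than the 41020 input rows. Two things are worth keeping in mind: only 41020 - 100 = 40920 rows have a full history of 100 steps, and with shuffle=True (as far as I can tell from the Keras implementation) each batch is filled with batch_size randomly drawn window positions, so the total is a multiple of the batch size rather than a count of distinct windows.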

    import numpy as np
    from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

    x = np.random.random((41020, 1))
    y = np.random.random((41020, 1))
    b_size=1024
    history_length=100
    data_gen = TimeseriesGenerator(x, y, shuffle=True,
                                   length=history_length,
                                   batch_size=b_size)
    

    Now get the batch sizes produced by the TimeseriesGenerator:

    data_len = [len(batch_x) for batch_x, batch_y in data_gen]
    

    All batches have length 1024; there is no final batch of size 60:

    set(data_len)
    

    Output:

    {1024}
    

    The total number of items across all batches is:

    sum(data_len)
    

    Output:

    40960
    

    One workaround is to choose a batch size that divides 41020 evenly, for example 2051 (41020 = 20 × 2051):

    x = np.random.random((41020, 1))
    y = np.random.random((41020, 1))
    b_size=2051
    history_length=100
    data_gen = TimeseriesGenerator(x, y, shuffle=True,
                                   length=history_length,
                                   batch_size=b_size)
    data_len = [len(batch_x) for batch_x, batch_y in data_gen]
    
    sum(data_len)
    

    Output:

    41020
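
To tie the numbers together and get one prediction per input row, here is a minimal sketch under stated assumptions. `expected_counts` reproduces the batch arithmetic as I understand the Keras implementation (default `stride=1` and default start/end indices; with `shuffle=True` each batch is assumed to hold `batch_size` randomly drawn window positions), so it is worth checking against your installed version. `predict_aligned` does not use TimeseriesGenerator at all: it walks the windows in order and writes the prediction for target row r at index r, leaving the first `history_length` rows as NaN because they have no full history. The names `expected_counts`, `predict_aligned` and `fake_predict` are hypothetical; in practice you would pass `model.predict` (or a wrapper around it) as `predict_fn`.

```python
import math
import numpy as np


def expected_counts(n_rows, history_length, batch_size, shuffle):
    """Return (n_windows, n_batches, n_items) for stride=1 and default indices.

    Assumption: with shuffle=True every batch holds batch_size randomly drawn
    window positions, so n_items is a multiple of batch_size.
    """
    n_windows = n_rows - history_length  # rows that have a full history
    n_batches = math.ceil(n_windows / batch_size)
    n_items = n_batches * batch_size if shuffle else n_windows
    return n_windows, n_batches, n_items


def predict_aligned(predict_fn, data, n_targets, history_length, batch_size):
    """Predict every window in order and store row r's prediction at index r.

    Rows 0 .. history_length-1 have no full history and stay NaN.
    """
    data = np.asarray(data, dtype=float)
    aligned = np.full((len(data), n_targets), np.nan)
    for start in range(history_length, len(data), batch_size):
        rows = np.arange(start, min(start + batch_size, len(data)))
        # The window for target row r is data[r - history_length : r].
        windows = np.stack([data[r - history_length:r] for r in rows])
        aligned[rows] = predict_fn(windows)
    return aligned


# Numbers from the question: 41020 rows, history of 100, batches of 1024.
print(expected_counts(41020, 100, 1024, shuffle=True))   # (40920, 40, 40960)
print(expected_counts(41020, 100, 1024, shuffle=False))  # (40920, 40, 40920)


# Hypothetical stand-in for model.predict: returns the first feature of the
# last row of each window, with shape (batch, 1).
def fake_predict(windows):
    return windows[:, -1, :1]


x_demo = np.arange(50 * 3, dtype=float).reshape(50, 3)
aligned = predict_aligned(fake_predict, x_demo, n_targets=1,
                          history_length=10, batch_size=16)
print(aligned.shape)                 # (50, 1)
print(np.isnan(aligned[:10]).all())  # True
```

Because the model in the question fixes the batch dimension with batch_input_shape, the shorter final batch may need to be padded up to b_size before calling model.predict, with the padded rows discarded afterwards; the sketch does not handle that.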