Tags: tensorflow, machine-learning, keras, neural-network, lstm

Regressor with LSTM layer keeps returning same value


If I run the following code, I get back an array in which every predicted value is the same.

Basically, my input to the regressor is an array of the numbers 0, 1, 2, ..., 99, and I expect the output to be 100. I do this in sequence for many such windows, as you can see in the code, which should be runnable as-is. What am I doing wrong, and why do the expected result and the actual outcome differ?


The code:

import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

from keras.layers import Dense
from keras.layers import LSTM
from keras.models import Sequential
from keras.layers import Dropout
from sklearn.preprocessing import MinMaxScaler
from datetime import datetime
from datetime import timedelta
from time import mktime


my_data = []
for i in range(0, 1000):
    my_data.append(i)

X_train = []
y_train = []

np_data = np.array(my_data)

# Each training sample is a window of 100 consecutive integers;
# the target is the integer that follows the window.
for i in range(0, np_data.size - 100):
    X_train.append(np_data[i : i+100])
    y_train.append(np_data[i+100])

X_train, y_train = np.array(X_train), np.array(y_train)

# An LSTM expects input of shape (num_samples, timesteps, features)
X_train = np.reshape(X_train, [X_train.shape[0], X_train.shape[1], 1])

regressor = Sequential()

regressor.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)))

regressor.add(Dropout(0.2))


regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units=50))
regressor.add(Dropout(0.2))

regressor.add(Dense(units=1))

regressor.compile(optimizer='adam', loss='mean_squared_error')

regressor.fit(X_train, y_train, epochs=5, batch_size=32)


X_test = []
y_test = []

my_data = []
for i in range(1000, 1500):
    my_data.append(i)

np_data = np.array(my_data)

for i in range(0, np_data.size - 100):
    X_test.append(np_data[i : i+100])
    y_test.append(np_data[i+100])

X_test = np.array(X_test)

X_test = np.reshape(X_test, [X_test.shape[0], X_test.shape[1], 1])

predicted = regressor.predict(X_test)


plt.plot(y_test, color = '#ffd700', label = "Real Data")
plt.plot(predicted, color = '#1fb864', label = "Predicted Data")

plt.title(" Price Prediction")
plt.xlabel("X axis")
plt.ylabel("Y axis")
plt.legend()
plt.show()

Solution

  • As I explained in the comment, this is a simple linear problem, so you can use a linear regression. If you want to use Keras/TF, you can build a model with a single Dense layer; here is code that will work:

    import numpy as np
    import matplotlib.pyplot as plt

    from keras import optimizers
    from keras.layers import Dense
    from keras.models import Sequential
    
    my_data = []
    for i in range(0, 1000):
        my_data.append(i)
    
    X_train = []
    y_train = []
    
    np_data = np.array(my_data)
    
    for i in range(0, np_data.size - 100):
        X_train.append(np_data[i: i + 100])
        y_train.append(np_data[i + 100])
    
    X_train, y_train = np.array(X_train), np.array(y_train)
    
    # A Dense layer expects 2-D input: (num_samples, num_features)
    X_train = np.reshape(X_train, [X_train.shape[0], X_train.shape[1]])
    
    regressor = Sequential()
    
    regressor.add(Dense(units=1, input_shape=(X_train.shape[1],)))
    
    regressor.compile(optimizer=optimizers.Adam(learning_rate=0.1), loss='mean_squared_error')
    
    # Full-batch training: one gradient update per epoch
    regressor.fit(X_train, y_train, epochs=1000, batch_size=len(X_train))
    
    X_test = []
    y_test = []
    
    my_data = []
    for i in range(1000, 1500):
        my_data.append(i)
    
    np_data = np.array(my_data)
    
    for i in range(0, np_data.size - 100):
        X_test.append(np_data[i: i + 100])
        y_test.append(np_data[i + 100])
    
    X_test = np.array(X_test)
    
    X_test = np.reshape(X_test, [X_test.shape[0], X_test.shape[1]])
    
    predicted = regressor.predict(X_test)
    
    plt.plot(y_test, color='#ffd700', label="Real Data")
    plt.plot(predicted, color='#1fb864', label="Predicted Data")
    
    plt.title(" Price Prediction")
    plt.xlabel("X axis")
    plt.ylabel("Y axis")
    plt.legend()
    plt.show()
    

    The code above produces the desired prediction. Here are the changes I made:

    1. changed the model to a single Dense layer; as I explained, the relationship is linear, so one unit is enough
    2. increased the batch size. This is just for faster training; you can reduce it if you want, but then you need to decrease the learning rate and increase the number of epochs at the same time (see the sketch after this list)
    3. increased the epochs to 1000. This data contains a lot of redundant information; only the last value of each X is useful, so it takes relatively more epochs to learn. In fact, thousands or even tens of thousands of epochs are common when fitting a linear model this way, since each epoch is very fast anyway
    4. reshaped the data to (num_samples, num_features), which is what a Dense layer expects
    5. increased the learning rate, again just to speed up learning
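
    As a sketch of the trade-off in point 2 (the numbers here are my own illustrative assumptions, not tuned values), reusing regressor, optimizers, X_train, and y_train from the listing above: with mini-batches you get many noisier updates per epoch, so you lower the learning rate and, if the loss has not converged, raise the epoch count.

    # Hypothetical alternative to the full-batch setup above:
    # smaller batches -> more (noisier) updates per epoch, so use a
    # lower learning rate; raise epochs if the loss hasn't converged.
    regressor.compile(optimizer=optimizers.Adam(learning_rate=0.01),
                      loss='mean_squared_error')
    regressor.fit(X_train, y_train, epochs=2000, batch_size=32)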

    I modified only enough to prove my point and didn't tune any other parameters; I am sure you can add regularizers, change the learning rate, and so on to make training faster and easier. But honestly, I don't think it is worth the time to tune them, since predicting linear relationships is really not what deep learning is for (see the closed-form sketch below).
    Hope this helps; feel free to comment if you have further questions :)
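
    To back up that last point, here is a minimal sketch (my own addition, using scikit-learn's LinearRegression rather than anything from the question): it solves the same windowed problem in closed form, with no epochs, batch size, or learning rate to tune.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Same sliding windows as above: each sample is 100 consecutive
    # integers, the target is the integer that follows.
    data = np.arange(1000)
    X = np.stack([data[i:i + 100] for i in range(data.size - 100)])
    y = data[100:]

    test = np.arange(1000, 1500)
    X_test = np.stack([test[i:i + 100] for i in range(test.size - 100)])
    y_test = test[100:]

    reg = LinearRegression().fit(X, y)
    print(reg.predict(X_test[:3]))  # ~[1100. 1101. 1102.]
    print(y_test[:3])               # [1100 1101 1102]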