Tags: tensorflow, machine-learning, keras, neural-network, lstm

Regressor with LSTM layer keeps returning same value


If I run the following code, I get back an array in which every predicted value is the same.

Basically, my input to the regressor is an array of the numbers 0, 1, 2, ..., 99, and I expect the output to be 100. I do this in sequence for many such windows, as you can see in the code, which should be runnable as-is. What am I doing wrong, and why do the expected result and the actual outcome differ?


The code:

import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

from keras.layers import Dense
from keras.layers import LSTM
from keras.models import Sequential
from keras.layers import Dropout
from sklearn.preprocessing import MinMaxScaler
from datetime import datetime
from datetime import timedelta
from time import mktime


my_data = []
for i in range(0, 1000):
    my_data.append(i)

X_train = []
y_train = []

np_data = np.array(my_data)

# Each training sample is a window of 100 consecutive integers;
# the target is the integer that follows the window.
for i in range(0, np_data.size - 100):
    X_train.append(np_data[i : i+100])
    y_train.append(np_data[i+100])

X_train, y_train = np.array(X_train), np.array(y_train)

# An LSTM expects input of shape (num_samples, timesteps, features)
X_train = np.reshape(X_train, [X_train.shape[0], X_train.shape[1], 1])

regressor = Sequential()

regressor.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)))

regressor.add(Dropout(0.2))


regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units=50))
regressor.add(Dropout(0.2))

regressor.add(Dense(units=1))

regressor.compile(optimizer='adam', loss='mean_squared_error')

regressor.fit(X_train, y_train, epochs=5, batch_size=32)


X_test = []
y_test = []

my_data = []
for i in range(1000, 1500):
    my_data.append(i)

np_data = np.array(my_data)

for i in range(0, np_data.size - 100):
    X_test.append(np_data[i : i+100])
    y_test.append(np_data[i+100])

X_test = np.array(X_test)

X_test = np.reshape(X_test, [X_test.shape[0], X_test.shape[1], 1])

predicted = regressor.predict(X_test)


plt.plot(y_test, color = '#ffd700', label = "Real Data")
plt.plot(predicted, color = '#1fb864', label = "Predicted Data")

plt.title(" Price Prediction")
plt.xlabel("X axis")
plt.ylabel("Y axis")
plt.legend()
plt.show()

Solution

  • As I explained in the comment, this is a simple linear problem, so you can use a linear regression. If you want to use Keras/TF, you can build a model with a single Dense layer; here is code that will work:

    import numpy as np
    import matplotlib.pyplot as plt

    from keras import optimizers
    from keras.layers import Dense
    from keras.models import Sequential
    
    my_data = []
    for i in range(0, 1000):
        my_data.append(i)
    
    X_train = []
    y_train = []
    
    np_data = np.array(my_data)
    
    for i in range(0, np_data.size - 100):
        X_train.append(np_data[i: i + 100])
        y_train.append(np_data[i + 100])
    
    X_train, y_train = np.array(X_train), np.array(y_train)
    
    # A Dense layer expects 2-D input: (num_samples, num_features)
    X_train = np.reshape(X_train, [X_train.shape[0], X_train.shape[1]])
    
    regressor = Sequential()
    
    regressor.add(Dense(units=1, input_shape=(X_train.shape[1],)))
    
    regressor.compile(optimizer=optimizers.Adam(learning_rate=0.1), loss='mean_squared_error')
    
    # Full-batch training: one gradient update per epoch
    regressor.fit(X_train, y_train, epochs=1000, batch_size=len(X_train))
    
    X_test = []
    y_test = []
    
    my_data = []
    for i in range(1000, 1500):
        my_data.append(i)
    
    np_data = np.array(my_data)
    
    for i in range(0, np_data.size - 100):
        X_test.append(np_data[i: i + 100])
        y_test.append(np_data[i + 100])
    
    X_test = np.array(X_test)
    
    X_test = np.reshape(X_test, [X_test.shape[0], X_test.shape[1]])
    
    predicted = regressor.predict(X_test)
    
    plt.plot(y_test, color='#ffd700', label="Real Data")
    plt.plot(predicted, color='#1fb864', label="Predicted Data")
    
    plt.title(" Price Prediction")
    plt.xlabel("X axis")
    plt.ylabel("Y axis")
    plt.legend()
    plt.show()
    

    The code above produces the desired prediction. Here are the changes I made:

    1. changed the model to a single Dense layer; as I explained, the relationship is linear, so one unit is enough
    2. increased the batch size. This is just for faster training; you can reduce it if you want, but then you need to decrease the learning rate and increase the number of epochs at the same time (see the sketch after this list)
    3. increased the epochs to 1000. This data contains a lot of redundant information; only the last value of each X is useful, so it takes relatively more epochs to learn. In fact, thousands or even tens of thousands of epochs are common when fitting a linear model this way, since each epoch is very fast anyway
    4. reshaped the data to (num_samples, num_features), which is what a Dense layer expects
    5. increased the learning rate, again just to speed up learning
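
    As a sketch of the trade-off in point 2 (the numbers here are my own illustrative assumptions, not tuned values), reusing regressor, optimizers, X_train, and y_train from the listing above: with mini-batches you get many noisier updates per epoch, so you lower the learning rate and, if the loss has not converged, raise the epoch count.

    # Hypothetical alternative to the full-batch setup above:
    # smaller batches -> more (noisier) updates per epoch, so use a
    # lower learning rate; raise epochs if the loss hasn't converged.
    regressor.compile(optimizer=optimizers.Adam(learning_rate=0.01),
                      loss='mean_squared_error')
    regressor.fit(X_train, y_train, epochs=2000, batch_size=32)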

    I modified only enough to prove my point and didn't tune any other parameters; I am sure you can add regularizers, change the learning rate, and so on to make training faster and easier. But honestly, I don't think it is worth the time to tune them, since predicting linear relationships is really not what deep learning is for (see the closed-form sketch below).
    Hope this helps; feel free to comment if you have further questions :)
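
    To back up that last point, here is a minimal sketch (my own addition, using scikit-learn's LinearRegression rather than anything from the question): it solves the same windowed problem in closed form, with no epochs, batch size, or learning rate to tune.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Same sliding windows as above: each sample is 100 consecutive
    # integers, the target is the integer that follows.
    data = np.arange(1000)
    X = np.stack([data[i:i + 100] for i in range(data.size - 100)])
    y = data[100:]

    test = np.arange(1000, 1500)
    X_test = np.stack([test[i:i + 100] for i in range(test.size - 100)])
    y_test = test[100:]

    reg = LinearRegression().fit(X, y)
    print(reg.predict(X_test[:3]))  # ~[1100. 1101. 1102.]
    print(y_test[:3])               # [1100 1101 1102]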