I'm trying to program a simple example to understand how LSTMs work. I want to take a simple integer series, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and predict the next number. I have some code, but I don't know what the second argument of the fit method needs to be.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM
df = pd.DataFrame(columns = ['Serie'])
for i in range(0, 10):
    df.loc[i, 'Serie'] = i
sc = MinMaxScaler(feature_range = (0, 1))
train_set = sc.fit_transform(df.iloc[:, [True]])
xTrain = []
for i in range(0, len(train_set) - 3):
    xTrain.append(train_set[i:i + 3, 0])
xTrain = np.array(xTrain)
xTrain = np.reshape(xTrain, (xTrain.shape[0], xTrain.shape[1], 1))
regresor = Sequential()
regresor.add(LSTM(units = 1, input_shape = (3, 1)))
regresor.compile(optimizer = 'rmsprop', loss = 'mse')
regresor.fit(xTrain, ???, batch_size = 1)
Can someone give me a very simple example of this?
You need to frame the problem as a supervised one. Every sample contains the independent variable x and the dependent variable y. Based on your question, x contains samples of 3 timesteps and 1 feature. Start off by doing the necessary imports:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import numpy as np
import tensorflow as tf
Let's define some constants:
points = 30 # number of data points to generate
timesteps = 3 # number of time steps per sample as LSTM layers need input shape (samples, time steps, features)
features = 1 # number of features per time step as LSTM layers need input shape (samples, time steps, features)
Generate the sequence 0, 1, ..., 30:
x = np.arange(points + 1) # array([ 0, 1, ..., 29, 30])
Here is where we start framing the problem as a supervised one, with x as a sequence of numbers and y as the sequence of next numbers:
y = x[1:] # [ 1, 2, ..., 29, 30 ]
x = x[:points] # [ 0, 1, ..., 28, 29 ]
Put both x and y together for scaling:
dataset = np.hstack((x.reshape((points, 1)), y.reshape((points, 1))))
scaler = MinMaxScaler((0, 1))
scaled = scaler.fit_transform(dataset)
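One detail of this step is easy to miss: the scaler rescales each column independently. The NumPy-only sketch below reproduces the per-column formula (value - min) / (max - min) by hand, assuming that this matches the scaler's behaviour for the range (0, 1). The helper name `minmax_per_column` is made up for illustration and is not part of scikit-learn.

```python
import numpy as np

points = 30
x = np.arange(points)          # 0 .. 29, the inputs
y = np.arange(1, points + 1)   # 1 .. 30, the next numbers

dataset = np.column_stack((x, y)).astype(float)

def minmax_per_column(data):
    """Hypothetical helper: scale every column to [0, 1] on its own."""
    col_min = data.min(axis=0)
    col_max = data.max(axis=0)
    return (data - col_min) / (col_max - col_min), col_min, col_max

scaled_by_hand, col_min, col_max = minmax_per_column(dataset)

# Column 0 spans 0..29 and column 1 spans 1..30, so both become i / 29.
same_columns = np.allclose(scaled_by_hand[:, 0], scaled_by_hand[:, 1])
```

Because both columns collapse to the same scaled values, the shift by one between x and y only reappears when the y column is mapped back with its own minimum and maximum.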
Let's define the inputs and outputs of our model:
x_train = scaled[:,0] # first column
x_train = x_train.reshape((points // timesteps, timesteps, features)) # LSTM layers need input of shape (samples, time steps, features)
y_train = scaled[:,1] # second column
y_train = y_train[2::3] # start at the third element in steps of 3, for a total of 10
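To make the slicing above concrete, here is a small sketch on the unscaled numbers. It first rebuilds the non-overlapping windows used in this answer, then shows a sliding-window variant closer to the loop in the question, which yields more samples from the same series. The helper `make_sliding_windows` is a hypothetical name, and the sliding variant is an alternative I am assuming you might want, not what the rest of this answer trains on.

```python
import numpy as np

points, timesteps, features = 30, 3, 1
series = np.arange(points + 1)            # 0 .. 30

# Non-overlapping windows, as in this answer: [0, 1, 2] -> 3, [3, 4, 5] -> 6, ...
x_blocks = series[:points].reshape((points // timesteps, timesteps, features))
y_blocks = series[1:][timesteps - 1::timesteps]

def make_sliding_windows(values, timesteps):
    """Hypothetical helper: one window per start index, target = next value."""
    n_samples = len(values) - timesteps
    windows = np.stack([values[i:i + timesteps] for i in range(n_samples)])
    targets = values[timesteps:]
    # Add a trailing feature axis so the shape is (samples, time steps, 1).
    return windows[..., np.newaxis], targets

x_slide, y_slide = make_sliding_windows(series, timesteps)
```

Either way, the second argument of fit is simply the array of "next values", one per window, with the same number of rows as the input.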
Model definition and compilation. I made the model slightly deeper (two stacked LSTM layers followed by a Dense output) for "better" performance (see the results below):
regresor = tf.keras.models.Sequential()
regresor.add(tf.keras.layers.LSTM(units = 4, return_sequences = True))
regresor.add(tf.keras.layers.LSTM(units = 2))
regresor.add(tf.keras.layers.Dense(units = 1))
regresor.compile(optimizer = 'rmsprop', loss = 'mse')
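If it helps to see how data flows through this stack, the sketch below traces the tensor shapes by hand and counts the trainable weights. It assumes the default Keras LSTM and Dense layers with biases enabled; the two counting functions are hypothetical helpers written for this illustration, not Keras calls.

```python
def lstm_param_count(input_dim, units):
    # Four weight sets (input, forget and output gates plus the cell candidate),
    # each with an input kernel, a recurrent kernel and a bias.
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim, units):
    # One weight per input/output pair plus one bias per output.
    return input_dim * units + units

samples, timesteps, features = 10, 3, 1

shape_after_lstm1 = (samples, timesteps, 4)  # return_sequences=True keeps the time axis
shape_after_lstm2 = (samples, 2)             # only the last time step is returned
shape_after_dense = (samples, 1)             # one prediction per sample

total_params = (
    lstm_param_count(features, 4)
    + lstm_param_count(4, 2)
    + dense_param_count(2, 1)
)
```

The first layer needs return_sequences = True because the second LSTM expects a 3D input; if the defaults assumed here hold, the per-layer counts should agree with what regresor.summary() reports after training.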
Train the model:
regresor.fit(x_train, y_train, batch_size = 2, epochs = 500, verbose = 1)
Some predictions:
y_hats = regresor.predict(x_train)
The results:
real y    predicted y
0.068966  0.086510
0.172414  0.162209
0.275862  0.252749
0.379310  0.356117
0.482759  0.467885
0.586207  0.582081
0.689655  0.692756
0.793103  0.795362
0.896552  0.887317
1.000000  0.967796
As you can see, the predictions are close to the real values: every one differs by less than 0.04 in scaled units.
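The closeness can be quantified directly from the table above. This sketch only re-uses the ten reported pairs (it does not retrain anything) and computes the largest and the mean absolute error. The conversion to original units assumes the target column spans 1 to 30, as in the data generated earlier.

```python
import numpy as np

# Values copied from the results table (scaled units).
real = np.array([0.068966, 0.172414, 0.275862, 0.379310, 0.482759,
                 0.586207, 0.689655, 0.793103, 0.896552, 1.000000])
pred = np.array([0.086510, 0.162209, 0.252749, 0.356117, 0.467885,
                 0.582081, 0.692756, 0.795362, 0.887317, 0.967796])

abs_err = np.abs(real - pred)
max_err = abs_err.max()    # worst single prediction
mae = abs_err.mean()       # mean absolute error

# The target column spans 1..30, so one scaled unit equals 29 original units.
max_err_original = max_err * 29
```

In other words, even the worst prediction in this run is off by less than one integer step in the original range.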
A plot of the results:
Note that, for simplicity, I performed the predictions on the training data set; testing should be done on separate test data. For that, you will have to generate more points and split them accordingly (for example, 70% training and 30% testing). Also, you can obtain the values in the original range by calling the scaler's inverse_transform method.
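A minimal sketch of that split and of mapping values back to the original range, using NumPy only. The series length of 100, the sliding windows, and fitting the scaling constants on the training targets alone are my assumptions for illustration. The inversion is written out by hand as scaled * (max - min) + min, which I am assuming mirrors what inverse_transform does for a (0, 1) range. Keep in mind that the scaler above was fitted on two columns, so its inverse_transform expects a two-column array as well.

```python
import numpy as np

points, timesteps = 100, 3                 # assumed: a longer series to allow a split
series = np.arange(points + 1, dtype=float)

# One window per start index; the target is the value right after the window.
n_samples = len(series) - timesteps
x_all = np.stack([series[i:i + timesteps] for i in range(n_samples)])
y_all = series[timesteps:]

# Chronological 70/30 split: no shuffling, so the test windows come last.
split = int(0.7 * n_samples)
x_train, x_test = x_all[:split], x_all[split:]
y_train, y_test = y_all[:split], y_all[split:]

# Scaling constants taken from the training targets only (an assumption).
y_min, y_max = y_train.min(), y_train.max()
y_test_scaled = (y_test - y_min) / (y_max - y_min)

# Inverting by hand recovers the original values.
y_test_restored = y_test_scaled * (y_max - y_min) + y_min
```

With a monotonically increasing series like this one, the scaled test targets fall outside the (0, 1) range seen during training, so do not be surprised if the test error is noticeably worse than the training error.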