I have just started implementing an LSTM in Python with TensorFlow / Keras to test out an idea I had, but I am struggling to create a proper model. This post is mainly about a ValueError that I often get (see the code at the bottom), but any and all help with creating a proper LSTM model for the problem below is greatly appreciated.
For each day, I want to predict which of a group of events will occur. The idea is that some events are recurring / always occur after a certain amount of time has passed, whereas other events occur only rarely or without any structure. An LSTM should be able to pick up on these recurring events in order to predict their occurrences for days in the future.
To represent the events, I use a list with values 0 and 1 (non-occurrence and occurrence). So for example, if I have the events ["Going to school", "Going to the gym", "Buying a computer"], I have lists like [1, 0, 1], [1, 1, 0], [1, 0, 1], [1, 1, 0] etc. The idea is then that the LSTM will recognize that I go to school every day, go to the gym every other day, and that buying a computer is very rare. Following the sequence of vectors, for the next day it should then predict [1, 0, 0].
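For illustration, this is roughly how I turn a day's events into such a 0/1 vector (the helper name and the example data here are just made up for this post):
import numpy as np

EVENTS = ["Going to school", "Going to the gym", "Buying a computer"]

def encode_day(occurred_events, all_events=EVENTS):
    # 1 if the event occurred on this day, 0 otherwise
    return np.array([1 if event in occurred_events else 0 for event in all_events])

print(encode_day({"Going to school", "Going to the gym"}))  # -> [1 1 0]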
So far I have done the following:
x_train[0] consists of days 1, 2, ..., 60 and y_train[0] contains day 61. x_train[1] then contains days 2, ..., 61 and y_train[1] contains day 62, etc. The idea is that the LSTM should learn to use data from the past 60 days, and that it can then iteratively start predicting/generating new vectors of event occurrences for future days.
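Concretely, I build these arrays with a sliding window, roughly like this (a sketch; events is assumed to be a NumPy array of shape (num_days, 193) holding the daily 0/1 vectors):
import numpy as np

WINDOW = 60  # number of past days per training sample

def make_windows(events, window=WINDOW):
    x, y = [], []
    for i in range(len(events) - window):
        x.append(events[i:i + window])                # days i .. i+59
        y.append(events[i + window:i + window + 1])   # day i+60, kept with an extra axis of length 1
    return np.array(x), np.array(y)

x_train, y_train = make_windows(events)
# with 365 days of data: x_train.shape == (305, 60, 193), y_train.shape == (305, 1, 193)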
I am really struggling with how to create a simple implementation of an LSTM that can handle this. So far I think I have figured out the following:
N_INPUTS = 60 and N_FEATURES = 193. I am not sure what N_BLOCKS should be, or whether the value it should take is strictly bound by some conditions. EDIT: According to https://zhuanlan.zhihu.com/p/58854907 it can be whatever I want.
model = Sequential()
model.add(LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
model.add(layers.Dense(193, activation='linear'))  # or some other activation function
model.add(layers.Dropout(0.2))
where the 0.2 is the rate at which units are dropped (set to 0). Then:
model.compile(loss=..., optimizer=...)
I am not sure if the loss function (e.g. MSE or categorical_crossentropy) and the optimizer matter if I just want a working implementation.
model.fit(x_train, y_train)
model.predict(the 60 days before the day I want to predict)
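Since several events can occur on the same day, this looks like a multi-label problem to me, so one option I am considering (just a guess on my part) is a sigmoid output with binary cross-entropy instead of a linear or softmax output:
from tensorflow import keras
from tensorflow.keras import layers

N_INPUTS, N_FEATURES, N_BLOCKS = 60, 193, 256  # N_BLOCKS chosen arbitrarily

model = keras.Sequential()
model.add(layers.LSTM(N_BLOCKS, input_shape=(N_INPUTS, N_FEATURES)))
model.add(layers.Dropout(0.2))                              # drop 20% of the LSTM outputs during training
model.add(layers.Dense(N_FEATURES, activation='sigmoid'))   # one independent probability per event
model.compile(loss='binary_crossentropy', optimizer='adam')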
One of my attempts can be seen here:
print(x_train.shape)
print(y_train.shape)
model = keras.Sequential()
model.add(layers.LSTM(256, input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(y_train.shape[2], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
model.fit(x_train,y_train) #<- This line causes the ValueError
Output:
(305, 60, 193)
(305, 1, 193)
Model: "sequential_29"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_27 (LSTM) (None, 256) 460800
dense_9 (Dense) (None, 1) 257
=================================================================
Total params: 461,057
Trainable params: 461,057
Non-trainable params: 0
_________________________________________________________________
ValueError: Shapes (None, 1, 193) and (None, 193) are incompatible
Alternatively, I have tried replacing the line model.add(layers.Dense(y_train.shape[2], activation='softmax')) with model.add(layers.Dense(y_train.shape[1], activation='softmax')). This produces ValueError: Shapes (None, 1, 193) and (None, 1) are incompatible.
Are my ideas somewhat okay? How can I resolve this ValueError? Any help would be greatly appreciated.
EDIT: As suggested in the comments, changing the shape of y_train did the trick.
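The reshape itself is not shown in the snippet below; it amounts to collapsing the length-1 axis, for example (assuming y_train is a NumPy array):
import numpy as np

y_train = np.squeeze(y_train, axis=1)  # (305, 1, 193) -> (305, 193)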
print(x_train.shape)
print(y_train.shape)
model = keras.Sequential()
model.add(layers.LSTM(193, input_shape=(x_train.shape[1], x_train.shape[2]))) # the 193 here can be any number, see: https://zhuanlan.zhihu.com/p/58854907
model.add(layers.Dropout(0.2))
model.add(layers.Dense(y_train.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.summary()
model.fit(x_train,y_train)
(305, 60, 193)
(305, 193)
Model: "sequential_40"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_38 (LSTM) (None, 193) 298764
dropout_17 (Dropout) (None, 193) 0
dense_16 (Dense) (None, 193) 37442
=================================================================
Total params: 336,206
Trainable params: 336,206
Non-trainable params: 0
_________________________________________________________________
10/10 [==============================] - 3s 89ms/step - loss: 595.5011
Now I am stuck on the fact that model.predict(x) seems to require x to be of the same size as x_train, and outputs an array with the same size as y_train. I was hoping only one set of 60 days would be required to output the 61st day. Does anyone know how to achieve this?
The solution may be to have y_train of shape (305, 193) instead of (305, 1, 193): since you predict one day at a time, this does not change the data, just its shape. You should then be able to train and predict. With model.add(layers.Dense(y_train.shape[1], activation='softmax')), of course.
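For a single prediction, you only need to give predict a batch of one window, for example (the last_60_days name is just illustrative; any array of shape (60, 193) works):
import numpy as np

last_60_days = x_train[-1]                          # shape (60, 193), the most recent window
next_day = model.predict(last_60_days[np.newaxis])  # add a batch axis -> input shape (1, 60, 193)
print(next_day.shape)                               # (1, 193): one predicted event vector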