Tags: python, machine-learning, keras, lstm, recurrent-neural-network

Understanding LSTMs - layers and data dimensions


I don't understand how LSTM layers are fed with data.

LSTM layers require input with three dimensions (x, y, z).

I have a time-series dataset: 2900 rows in total, which should conceptually be divided into groups of 23 consecutive rows, where each row is described by 178 features. In other words, every 23 rows form a new 23-row sequence belonging to a new patient.

Are the following statements right?

  • x (samples) = the number of 23-row sequences, namely len(dataframe) / 23
  • y (time steps) = the length of each sequence, which is 23 here by domain assumption
  • z (feature size) = the number of columns in each row, 178 in this case

Therefore x * y = the number of rows in the dataset.
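
To make that concrete, here is a minimal sketch of the reshape I have in mind (assuming the rows are already ordered patient by patient, and using synthetic data in place of my real dataframe):

    import numpy as np
    import pandas as pd

    n_timesteps = 23     # y: length of each patient's sequence
    n_features = 178     # z: features per row
    n_samples = 126      # x: number of sequences (~2900 / 23)

    # Stand-in for the real dataframe: one row per time step, one column per feature.
    df = pd.DataFrame(np.random.rand(n_samples * n_timesteps, n_features))

    # Reshape the flat table (rows, features) into (samples, time steps, features).
    X = df.values.reshape(n_samples, n_timesteps, n_features)
    print(X.shape)       # (126, 23, 178)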

Assuming this is correct, what is the batch size when training a model in this case?

Might it be the number of samples considered in one epoch during training?

If so, with x (the number of samples) equal to 200, it would make no sense to set a batch_size greater than 200, because that is my upper limit - I don't have more data to train on.


Solution

  • I interpret your description as saying that your total dataset consists of 2900 rows, where every 23 consecutive rows form one data sample: 23 time steps, each with a 178-dimensional feature vector. That gives roughly 2900 / 23 ≈ 126 samples.

    If that is the case, the input_shape for your model should be defined as (23, 178). The batch size is simply the number of samples (out of those ~126) processed together in a single training / test / prediction step; it is not the number of samples in an epoch, since one epoch runs through all of them, batch by batch.

    Try the following:

    from keras.models import Sequential
    from keras.layers import LSTM


    # input_shape = (time steps, features); the batch dimension is left out.
    model = Sequential()
    model.add(LSTM(64, input_shape=(23, 178)))
    model.compile(loss='mse', optimizer='sgd')
    model.summary()

    print(model.input)
    

    This is just a simple model that outputs a single 64-wide vector per sample. You will see that the expected model.input is:

    Tensor("lstm_3_input:0", shape=(?, 23, 178), dtype=float32)
    

    The batch size is left unset in the input shape (the ? in the first dimension), which means the model can be used to train on or predict batches of different sizes.
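
    As a usage sketch, you can fit the model defined above on stand-in data of that shape (random numbers here, just to illustrate the shapes) and pick any batch size; batch_size is how many samples are processed per weight update, while one epoch still covers all of them:

    import numpy as np

    # Stand-in data: 126 sequences of 23 time steps with 178 features each.
    X = np.random.rand(126, 23, 178)
    y = np.random.rand(126, 64)   # targets matching the 64-wide LSTM output

    # 126 samples with batch_size=32 -> 4 weight updates per epoch.
    model.fit(X, y, batch_size=32, epochs=2)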