I don't understand how LSTM layers are fed with data. LSTM layers require three-dimensional input (x, y, z).
I have a dataset of time series: 2900 rows in total, which should conceptually be divided into groups of 23 consecutive rows, where each row is described by 178 features. In other words, every 23 rows form a new 23-row sequence belonging to a new patient.
Are the following statements right?
x (samples) = number of 23-row sequences, i.e. len(dataframe)/23
y (time steps) = length of each sequence; by domain assumption, 23 here
z (feature size) = number of columns for each row; 178 in this case

Therefore x * y = number of rows in the dataset.
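To make the (samples, time steps, features) layout concrete, here is a minimal reshaping sketch with NumPy. Note that 2900 is not an exact multiple of 23 (23 * 126 = 2898), so this sketch assumes 126 complete 23-row sequences; the variable names are illustrative, not from the question.

```python
import numpy as np

# Hypothetical flat dataset: one row per time step, 178 features per row.
# Assuming 126 complete sequences (2898 rows), since 2900 is not
# an exact multiple of 23.
n_sequences, timesteps, n_features = 126, 23, 178
flat = np.random.rand(n_sequences * timesteps, n_features)

# Reshape to the 3-D layout an LSTM expects: (samples, time steps, features).
sequences = flat.reshape(n_sequences, timesteps, n_features)
print(sequences.shape)  # (126, 23, 178)
```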
Assuming this is correct, what is the batch size while training a model in this case? Is it the number of samples considered in an epoch while training? If so, with x (the number of samples) equal to 200, it makes no sense to set a batch_size greater than 200, because that's my upper limit: I don't have more data to train on.
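The relationship between batch size, samples, and epochs can be sketched with plain arithmetic (the numbers here reuse the question's hypothetical 200 samples; batch_size=32 is an arbitrary illustrative choice):

```python
import math

n_samples = 200   # x, the total number of sequences
batch_size = 32   # samples processed per gradient update

# One epoch passes over ALL samples once, split into mini-batches;
# a batch_size larger than n_samples simply means one batch per epoch.
updates_per_epoch = math.ceil(n_samples / batch_size)
print(updates_per_epoch)  # 7
```

So the batch size is not the number of samples per epoch; an epoch always covers the whole dataset, and batch_size only controls how many samples each gradient update sees.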
I interpret your description as saying that your total dataset consists of 2900 data samples, where each data sample has 23 time slots, each with a vector of 178 dimensions.
If that is the case, the input_shape for your model should be defined as (23, 178). The batch size is simply the number of samples (out of the 2900) that will be used for a training / test / prediction run.
Try the following:
from keras.models import Sequential
from keras.layers import Dense, LSTM
model = Sequential()
model.add(LSTM(64, input_shape=(23,178)))
model.compile(loss='mse', optimizer='sgd')
model.summary()
print(model.input)
This is just a simplistic model that outputs a single 64-wide vector for each sample. You will see that the expected model.input is:
Tensor("lstm_3_input:0", shape=(?, 23, 178), dtype=float32)
The batch size is left unset in the input shape (the `?` in the tensor shape), which means the model can be used to train on, or predict, batches of different sizes.
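To see this in practice, the same model can predict on batches of different sizes without any change. This sketch uses `tensorflow.keras`, which exposes the same API as the standalone `keras` import above; the batch sizes 5 and 50 are arbitrary:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

model = Sequential()
model.add(LSTM(64, input_shape=(23, 178)))
model.compile(loss='mse', optimizer='sgd')

# The leading batch dimension is None/?, so any batch size is accepted.
out_small = model.predict(np.random.rand(5, 23, 178))
out_large = model.predict(np.random.rand(50, 23, 178))
print(out_small.shape, out_large.shape)  # (5, 64) (50, 64)
```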