It is not clear to me whether there is any difference between specifying the input dimension, Input(shape=(20,)), and leaving it unspecified, Input(shape=(None,)), in the following example:
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense
from keras.models import Model

input_layer = Input(shape=(None,))
emb = Embedding(86, 300)(input_layer)
lstm = Bidirectional(LSTM(300))(emb)
output_layer = Dense(10, activation="softmax")(lstm)
model = Model(input_layer, output_layer)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["acc"])
history = model.fit(my_x, my_y, epochs=1, batch_size=632, validation_split=0.1)
my_x (shape (2000, 20)) contains integers referring to characters, while my_y contains the one-hot encoding of some labels. With Input(shape=(None,)), I see that I can call model.predict(my_x[:, 0:10]), i.e., give only 10 characters as input instead of 20: how is that possible? I was assuming that all 20 dimensions in my_x were needed to predict the corresponding y.
What you say with shape=(20,) is that the sequences you feed into the model have a strict length of 20; with shape=(None,), you leave the length unspecified. While most layers need a fixed input length, recurrent neural networks (such as the LSTM you use here) do not need a fixed sequence length. The LSTM simply does not care whether your sequence contains 20 or 100 timesteps, as it just loops over them. However, when you fix the number of timesteps to 20, the model expects exactly 20 and will raise an error if it gets anything else.
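To illustrate, here is a minimal sketch (with dummy random data; the build_model helper and the example values are mine, not from your code) that builds your architecture once with shape=(None,) and once with shape=(20,) and compares their behavior at predict time:

import numpy as np
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense
from keras.models import Model

def build_model(seq_len):
    # seq_len=None leaves the timestep dimension open; an integer fixes it
    input_layer = Input(shape=(seq_len,))
    emb = Embedding(86, 300)(input_layer)
    lstm = Bidirectional(LSTM(300))(emb)
    output_layer = Dense(10, activation="softmax")(lstm)
    return Model(input_layer, output_layer)

x = np.random.randint(0, 86, size=(4, 20))  # dummy batch: 4 sequences of length 20

flexible = build_model(None)
flexible.predict(x)            # works: the LSTM loops over 20 timesteps
flexible.predict(x[:, 0:10])   # also works: it just loops over 10 timesteps

fixed = build_model(20)
fixed.predict(x)               # works: exactly 20 timesteps, as declared
# fixed.predict(x[:, 0:10])    # raises an error: the input layer expects length 20

The fixed-length model rejects the shorter batch because the declared input shape is checked before the data ever reaches the LSTM; with None, only the batch dimension is checked.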
For more information, see this post by Tim.