Since I load my data (images) from the structured folders, I utilize the flow_from_directory
function of the ImageDataGenerator
class, which is provided by Keras
. I've no issues while feeding this data to a CNN
model. But when it comes to an LSTM
model, getting the following error: ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (64, 28, 28, 1)
. How can I reduce the dimension of the input data while reading it via ImageDataGenerator
objects to be able to use an LSTM
model instead of a CNN
?
p.s. The shape of the input images is (28, 28)
and they are grayscale
.
train_valid_datagen = ImageDataGenerator(validation_split=0.2)
train_gen = train_valid_datagen.flow_from_directory(
directory=TRAIN_IMAGES_PATH,
target_size=(28, 28),
color_mode='grayscale',
batch_size=64,
class_mode='categorical',
shuffle=True,
subset='training'
)
Update: The LSTM model code:
inp = Input(shape=(28, 28, 1))
inp = Lambda(lambda x: squeeze(x, axis=-1))(inp) # from 4D to 3D
x = LSTM(num_units, dropout=dropout, recurrent_dropout=recurrent_dropout, activation=activation_fn, return_sequences=True)(inp)
x = BatchNormalization()(x)
x = Dense(128, activation=activation_fn)(x)
output = Dense(nb_classes, activation='softmax', kernel_regularizer=l2(0.001))(x)
model = Model(inputs=inp, outputs=output)
you start feeding your network with 4D data like your images in order to have the compatibility with ImageDataGenerator
and then you have to reshape them in 3D format for LSTM.
These are the possibilities:
with only one channel you can simply squeeze the last dimension
inp = Input(shape=(28, 28, 1))
x = Lambda(lambda x: tf.squeeze(x, axis=-1))(inp) # from 4D to 3D
x = LSTM(32)(x)
if you have multiple channels (this is the case of RGB images or if would like to apply a RNN after a Conv2D) a solution can be this
inp = Input(shape=(28, 28, 1))
x = Conv2D(32, 3, padding='same', activation='relu')(inp)
x = Reshape((28,28*32))(x) # from 4D to 3D
x = LSTM(32)(x)
the fit can be computed as always with model.fit_generator
UPDATE: model review
inp = Input(shape=(28, 28, 1))
x = Lambda(lambda x: squeeze(x, axis=-1))(inp) # from 4D to 3D
x = LSTM(32, dropout=dropout, recurrent_dropout=recurrent_dropout, activation=activation_fn, return_sequences=False)(x)
x = BatchNormalization()(x)
x = Dense(128, activation=activation_fn)(x)
output = Dense(nb_classes, activation='softmax', kernel_regularizer=l2(0.001))(x)
model = Model(inputs=inp, outputs=output)
model.summary()
pay attention when you define inp variable (don't overwrite it)
set return_seq = False in LSTM in order to have 2D output