I have read a sequence of images (frames) into a NumPy array of shape (9135, 200, 200, 4),
where 9135 is the number of samples, 200 is both the height and the width, and 4 is the number of channels (R-G-B-Depth).
I have a sequential model with an LSTM layer:
x_train=np.reshape(x_train,(x_train.shape[0],x_train.shape[1],x_train.shape[2],x_train.shape[3],1))
#(9135, 200, 200, 4, 1)
x_val=np.reshape(x_val,(x_val.shape[0],x_val.shape[1],x_val.shape[2],x_val.shape[3],1))
#(3046, 200, 200, 4, 1)
model = Sequential()
model.add(TimeDistributed(Conv2D(64, (3,3), activation='relu'), input_shape=(200, 200, 4)))
model.add(TimeDistributed(Conv2D(64, (3,3), activation='relu')))
model.add(TimeDistributed(GlobalAveragePooling2D()))
model.add(LSTM(1024, activation='relu', return_sequences=False))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(.5))
model.add(Dense(10, activation='sigmoid'))
model.compile('adam', loss='categorical_crossentropy')
model.summary()
history = model.fit(x_train, y_train, epochs=epochs,batch_size=batch_size,verbose=verbose, validation_data=(x_val, y_val))
but fitting the model raises an error:
ValueError: Input 0 of layer conv2d is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: [None, 200, 4]
What is the suggested way to input a 4 channel image into an LSTM layer in Keras?
PS: Each class has a different number of frames, so I do not know how to handle a variable number of timesteps.
You need to reshape the data so that it has an explicit time axis, i.e. (samples, timesteps, height, width, channels):
x_train=np.reshape(x_train,(x_train.shape[0],1,x_train.shape[1],x_train.shape[2],x_train.shape[3]))
#(9135, 1, 200, 200, 4)
x_val=np.reshape(x_val,(x_val.shape[0],1,x_val.shape[1],x_val.shape[2],x_val.shape[3]))
#(3046, 1, 200, 200, 4)
and change the model's input_shape to input_shape=(None, 200, 200, 4), where None leaves the number of timesteps unspecified.
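To see why the original shape failed and what the reshape changes, here is a minimal NumPy-only sketch. It uses a tiny stand-in array instead of the real (9135, 200, 200, 4) data so that it runs quickly. The names frames, pad_clips and clips are made up for illustration, and zero-padding clips to a common length is only one possible way to deal with a different number of frames per class, not something the question or the reshape above prescribes.

```python
import numpy as np

# Tiny stand-in for the real data (assumption: 6 frames of 8x8 with 4 channels
# instead of 9135 frames of 200x200 with 4 channels).
frames = np.zeros((6, 8, 8, 4), dtype=np.float32)

# TimeDistributed treats axis 1 as the time axis. If the model is told that
# each sample is (height, width, channels), the slice handed to Conv2D per
# "timestep" is only (width, channels), which matches the (200, 4) in the error.
per_step_wrong = frames.shape[2:]        # (8, 4): not an image any more

# Reshape from the answer: one frame per sequence, i.e. a time axis of length 1.
x = frames[:, np.newaxis, ...]           # (6, 1, 8, 8, 4)
per_step_right = x.shape[2:]             # (8, 8, 4): a full image per timestep


def pad_clips(clips, pad_value=0.0):
    """Zero-pad a list of (n_frames, H, W, C) arrays to a common length.

    Hypothetical helper: returns an array of shape
    (n_clips, max_frames, H, W, C).
    """
    max_len = max(len(c) for c in clips)
    out = np.full((len(clips), max_len) + clips[0].shape[1:],
                  pad_value, dtype=clips[0].dtype)
    for i, c in enumerate(clips):
        out[i, :len(c)] = c              # real frames first, padding after
    return out


# Hypothetical clips with 2 and 4 frames respectively.
clips = [frames[:2], frames[2:6]]
batch = pad_clips(clips)                 # (2, 4, 8, 8, 4)

print(per_step_wrong, x.shape, per_step_right, batch.shape)
```

With a single timestep per sample the LSTM only ever sees one frame, so grouping frames into clips as in the second half of the sketch may be closer to what a sequence model needs. How the padded frames are handled downstream (for example, masked or simply left as zeros) is left open here.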