Please forgive my ignorance as I am really new to the area. I am trying to get the correct output shape from my neural network which has 3 Conv2D layers then 2 Dense layers. My input shape is (140, 140, 4), which are 4 grayscale images. When I fit in 1 input, I am expecting an output of (1, 4) but I am getting a shape of (14, 14, 4) here. What am I doing wrong here? Thank you very much for your help in advance!
meta_layers = [Conv2D, Conv2D, Conv2D, Dense, Dense]
meta_inits = ['lecun_uniform'] * 5
meta_nodes = [32, 64, 64, 512, 4]
meta_filter = [(8,8), (4,4), (3,3), None, None]
meta_strides = [(4,4), (2,2), (1,1), None, None]
meta_activations = ['relu'] * 5
meta_loss = "mean_squared_error"
meta_optimizer=RMSprop(lr=0.00025, rho=0.9, epsilon=1e-06)
meta_n_samples = 1000
meta_epsilon = 1.0;
meta = Sequential()
meta.add(self.meta_layers[0](self.meta_nodes[0], init=self.meta_inits[0], input_shape=(140, 140, 4), kernel_size=self.meta_filters[0], strides=self.meta_strides[0]))
meta.add(Activation(self.meta_activations[0]))
for layer, init, node, activation, kernel, stride in list(zip(self.meta_layers, self.meta_inits, self.meta_nodes, self.meta_activations, self.meta_filters, self.meta_strides))[1:]:
if(layer == Conv2D):
meta.add(layer(node, init = init, kernel_size = kernel, strides = stride))
meta.add(Activation(activation))
elif(layer == Dense):
meta.add(layer(node, init=init))
meta.add(Activation(activation))
print("meta node: " + str(node))
meta.compile(loss=self.meta_loss, optimizer=self.meta_optimizer)
Your problem lies in the fact that in Keras with version >= 2.0, a Dense
layer is applied to the last channel of the inputs (you may read about it here). So if you apply:
Dense(512)
to a Conv2D
layer with shape (14, 14, 64)
you'll get the output with shape (14, 14, 512)
and then Dense(4)
applied to it will give you output with shape (14, 14, 4)
. You can call model.summary()
method to confirm my words.
In order to solve this you need to apply one of the following layers: GlobalMaxPooling2D
, GlobalAveragePooling2D
or Flatten
to the output from the last convolutional layer in order to squash your output to be only 2 dimensional (with shape (batch_size, features)
.