Tags: tensorflow, keras, autoencoder, deconvolution, max-pooling

expected conv2d_7 to have shape (220, 220, 1) but got array with shape (224, 224, 1)


I am following the tutorial from the Keras blog (https://blog.keras.io/building-autoencoders-in-keras.html) to build an autoencoder.

I am using my own dataset, and I run the following code on my 224*224 images.

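from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

input_img = Input(shape=(224, 224, 1))  # size of the input image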
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (28, 28, 8), i.e. 6272-dimensional

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

When I print the summary of the autoencoder, the last layer has an output of 220 by 220 instead of 224 by 224. I have attached a snapshot of that summary.

What I don't understand is how 112*112 gets converted to 110*110. I was expecting conv2d_6 (Conv2D) to give me 112*112 with 16 kernels.

[Image: snapshot of the autoencoder summary]

If I remove the conv2d_6 layer, it works. But I want to keep it; otherwise I would be doing UpSampling twice in a row. I don't understand what's wrong.

Can somebody guide me on this?


Solution

  • You need to add padding='same' to that layer, so it should look like:

    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    

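    This keeps the spatial dimensions unchanged. Without it, Conv2D falls back to its default padding='valid', which applies no padding, so a 3-by-3 kernel trims one pixel from each border: 112*112 becomes 110*110 after that layer, and the final UpSampling2D doubles that to 220*220.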
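
If it helps to see the arithmetic, here is a minimal sketch in plain Python (no Keras needed) that traces the spatial size through the layers from the question. The helper names conv2d_size, pool_size_same, upsample_size and decoder_output are made up for this illustration and are not Keras functions; the formulas assume stride 1 for the convolutions and a pooling stride equal to the pool size, which matches the code in the question.

```python
import math


def conv2d_size(size, kernel, padding="valid", stride=1):
    """Output size along one axis for a Conv2D-style layer (hypothetical helper)."""
    if padding == "same":
        # 'same' pads the input so that only the stride shrinks the output
        return math.ceil(size / stride)
    # 'valid' uses no padding: the kernel must fit entirely inside the input
    return (size - kernel) // stride + 1


def pool_size_same(size, pool=2):
    """Output size for MaxPooling2D with padding='same' and stride == pool."""
    return math.ceil(size / pool)


def upsample_size(size, factor=2):
    """Output size for UpSampling2D."""
    return size * factor


def decoder_output(encoded_size, third_conv_padding):
    """Trace the decoder from the question; only the third conv's padding varies."""
    s = conv2d_size(encoded_size, 3, "same")
    s = upsample_size(s)
    s = conv2d_size(s, 3, "same")
    s = upsample_size(s)
    s = conv2d_size(s, 3, third_conv_padding)  # the layer without padding='same'
    s = upsample_size(s)
    return conv2d_size(s, 3, "same")


# Encoder: three (conv 'same' -> pool 'same') stages starting from 224
encoded = 224
for _ in range(3):
    encoded = pool_size_same(conv2d_size(encoded, 3, "same"))

print(encoded)                           # 28
print(decoder_output(encoded, "valid"))  # 220
print(decoder_output(encoded, "same"))   # 224
```

With 'valid' the third decoder convolution yields 112 - 3 + 1 = 110, which the last UpSampling2D turns into 220, so the output no longer matches the 224*224 target.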