I am following the tutorial from the Keras blog (https://blog.keras.io/building-autoencoders-in-keras.html) to build an autoencoder.
I am using my own dataset, and I am running the following code on my 224*224 grayscale images.
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

input_img = Input(shape=(224, 224, 1))  # 224*224 grayscale input
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (28, 28, 8) i.e. 6272-dimensional
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
When I look at the summary of the autoencoder, the last layer has an output of 220 by 220. I have attached a snapshot of that summary.
What I don't understand is how the shape goes from 112*112 down to 110*110. I was expecting conv2d_6 (Conv2D) to give me 112*112 with 16 kernels.
If I remove the Conv2D_6 layer it works, but I want to keep it, or else I would have two UpSampling layers in a row. I don't understand what's wrong.
Can somebody guide me on this?
You need to add padding='same' to that layer, so it should look like:
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
Then it will keep the same dimensions. Without it, Keras uses the default padding='valid' (no padding), and because your kernel is 3-by-3, each spatial dimension shrinks by 2, so your 112*112 turns into 110*110 after that layer.
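To see why, you can work through the standard convolution output-size formula by hand. This is a minimal sketch (conv2d_out is a helper written here for illustration, not a Keras function) that also traces the spatial size through the network above:

```python
def conv2d_out(n, k, s=1, p=0):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# padding='valid' (the Keras default): a 3x3 kernel trims one pixel per side
print(conv2d_out(112, 3))        # -> 110
# padding='same' with stride 1 pads so the size is preserved (p = (k-1)//2)
print(conv2d_out(112, 3, p=1))   # -> 112

# Tracing the spatial size through the autoencoder for a 224*224 input
size = 224
for step in ["pool", "pool", "pool",  # encoder: 224 -> 112 -> 56 -> 28
             "up", "up",              # decoder: 28 -> 56 -> 112
             "conv_valid",            # the unpadded 3x3 Conv2D: 112 -> 110
             "up"]:                   # 110 -> 220, matching the summary
    if step == "pool":
        size //= 2
    elif step == "up":
        size *= 2
    else:  # the Conv2D without padding='same'
        size = conv2d_out(size, 3)
print(size)                           # -> 220
```

With padding='same' on that layer the trace gives 112 -> 112 instead of 112 -> 110, and the final output comes out at 224*224 as intended.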