[http://deeplearning.net/tutorial/lenet.html#lenet]
In the above link it says: "Construct the first convolutional pooling layer: filtering reduces the image size to (28-5+1, 28-5+1) = (24, 24)."

Convolving data of size a with a filter of size b gives an output of size a+b-1. Here the data size is 28x28 and the filter size is 5x5, so the output size should be (28+5-1, 28+5-1), yet it is given as (28-5+1, 28-5+1). Why?
It depends on the border_mode. conv2d uses border_mode='valid' by default, which means (from the SciPy documentation): "The output consists only of those elements that do not rely on the zero-padding."

So with border_mode='valid' and a (5, 5) filter, the output is the same size as the input minus a two-pixel border, i.e. image_shape - filter_shape + 1. Hence with an input of size (28, 28), the output is (24, 24).
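As a quick sanity check, here is a minimal sketch using scipy.signal.convolve2d, which follows the same 'valid' convention quoted above (the array contents are arbitrary; only the shapes matter):

```python
import numpy as np
from scipy.signal import convolve2d

img = np.zeros((28, 28))   # dummy 28x28 "image"
filt = np.ones((5, 5))     # dummy 5x5 filter

# mode='valid': output shape is image_shape - filter_shape + 1
print(convolve2d(img, filt, mode='valid').shape)  # (24, 24)
```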
The alternative, border_mode='full', will zero-pad the input so that the output has shape image_shape + filter_shape - 1.
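To see both modes in Theano itself, here is a rough sketch (assuming the theano.tensor.nnet.conv2d interface; the values are random since only the output shapes are of interest):

```python
import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet import conv2d

x = T.tensor4('x')  # (batch, channels, rows, cols)
w = T.tensor4('w')  # (n_filters, channels, filter_rows, filter_cols)

# Compile a function that returns the output shape for both border modes
shapes = theano.function(
    [x, w],
    [conv2d(x, w, border_mode='valid').shape,
     conv2d(x, w, border_mode='full').shape])

img = np.random.randn(1, 1, 28, 28).astype(theano.config.floatX)
filt = np.random.randn(1, 1, 5, 5).astype(theano.config.floatX)
print(shapes(img, filt))  # valid: (1, 1, 24, 24), full: (1, 1, 32, 32)
```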