[http://deeplearning.net/tutorial/lenet.html#lenet]
In the above link it says: "Construct the first convolutional pooling layer: filtering reduces the image size to (28-5+1, 28-5+1) = (24, 24)."

Convolving data of size a with a filter of size b gives an output of size a+b-1. Here the data size is 28x28 and the filter size is 5x5, so the output size should be (28+5-1, 28+5-1), yet it is given as (28-5+1, 28-5+1). Why?
It depends on the border_mode. conv2d uses border_mode='valid' by default, which means (from the SciPy documentation): "The output consists only of those elements that do not rely on the zero-padding."

So with border_mode='valid' and a (5, 5) filter, the output is the same size as the input minus a two-pixel border, i.e. image_shape - filter_shape + 1. Hence with an input of size (28, 28), the output is (24, 24).
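As a quick sanity check, here is a minimal sketch using scipy.signal.convolve2d, which follows the same 'valid' convention quoted above (the array contents are arbitrary; only the shapes matter):

```python
import numpy as np
from scipy.signal import convolve2d

img = np.zeros((28, 28))   # dummy 28x28 "image"
filt = np.ones((5, 5))     # dummy 5x5 filter

# mode='valid': output shape is image_shape - filter_shape + 1
print(convolve2d(img, filt, mode='valid').shape)  # (24, 24)
```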
The alternative, border_mode='full', will zero-pad the input so that the output has shape image_shape + filter_shape - 1.
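To see both modes in Theano itself, here is a rough sketch (assuming the theano.tensor.nnet.conv2d interface; the values are random since only the output shapes are of interest):

```python
import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet import conv2d

x = T.tensor4('x')  # (batch, channels, rows, cols)
w = T.tensor4('w')  # (n_filters, channels, filter_rows, filter_cols)

# Compile a function that returns the output shape for both border modes
shapes = theano.function(
    [x, w],
    [conv2d(x, w, border_mode='valid').shape,
     conv2d(x, w, border_mode='full').shape])

img = np.random.randn(1, 1, 28, 28).astype(theano.config.floatX)
filt = np.random.randn(1, 1, 5, 5).astype(theano.config.floatX)
print(shapes(img, filt))  # valid: (1, 1, 24, 24), full: (1, 1, 32, 32)
```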