Convolutional neural network: how does the second conv layer operate on the output of the first pooling layer?

I'm reading material from the TensorFlow website:

https://www.tensorflow.org/tutorials/layers

Suppose we have 10 greyscale 28x28 pixel images.

If we apply 32 5x5 convolutional filters in the 1st conv layer, with zero padding so that the 28x28 size is preserved, we get 10*32*28*28 data.
If we apply 2x2 max pooling with stride 2 in the 1st pooling layer, we get 10*32*14*14 data.
At this point, each image has become a 14*14 feature map with 32 channels.

So, if we apply a second convolutional layer (let's say 64 5x5 filters, as in the link), do we apply these filters to each channel of each image and get 10*32*64*14*14 data?

Solution

Yes and no. The filters are applied to every channel of every image, but the output is not 10*32*64*14*14. Each of the 64 filters spans all 32 input channels, and the 32 per-channel results are summed into a single output channel. The output is therefore 10*64*14*14, because the layer specifies 64 output channels per image (assuming zero padding that preserves the 14*14 size, as in the first layer). In turn, the weights used for this convolution have size 32*64*5*5: one 5-by-5 kernel for every pair of input channel and output channel.
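
To make the shapes concrete, here is a minimal pure-Python sketch of a single zero-padded convolution layer applied to one image. It is only an illustration, not how TensorFlow implements the layer: the helper names conv2d_same and shape are made up for this example, the sizes are small stand-ins (3 input channels, 4 filters, 6x6 maps instead of 32, 64 and 14x14) so that it runs quickly, and it assumes stride 1 with no bias or activation.

```python
def conv2d_same(image, weights):
    """Naive zero-padded, stride-1 2D convolution (cross-correlation) for one image.

    image:   nested lists indexed [c_in][h][w]
    weights: nested lists indexed [c_in][c_out][k][k], with k odd
    returns: nested lists indexed [c_out][h][w]
    """
    c_in, h, w = len(image), len(image[0]), len(image[0][0])
    c_out, k = len(weights[0]), len(weights[0][0])
    pad = k // 2
    out = [[[0.0] * w for _ in range(h)] for _ in range(c_out)]
    for o in range(c_out):
        for i in range(c_in):
            for y in range(h):
                for x in range(w):
                    acc = 0.0
                    for dy in range(k):
                        for dx in range(k):
                            yy, xx = y + dy - pad, x + dx - pad
                            # Positions outside the image count as zeros (zero padding).
                            if 0 <= yy < h and 0 <= xx < w:
                                acc += image[i][yy][xx] * weights[i][o][dy][dx]
                    # Every input channel adds into the SAME output channel,
                    # so no extra c_in axis appears in the output.
                    out[o][y][x] += acc
    return out


def shape(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Stand-in sizes (assumptions for speed): the question uses C_IN=32, C_OUT=64, H=W=14.
C_IN, C_OUT, H, W, K = 3, 4, 6, 6, 5

# All-ones data makes the summation over input channels easy to see.
image = [[[1.0] * W for _ in range(H)] for _ in range(C_IN)]
weights = [[[[1.0] * K for _ in range(K)] for _ in range(C_OUT)] for _ in range(C_IN)]

output = conv2d_same(image, weights)

print(shape(weights))   # (3, 4, 5, 5)  -> analogous to 32*64*5*5
print(shape(output))    # (4, 6, 6)     -> analogous to 64*14*14 per image
print(output[0][2][2])  # 75.0 = 3 input channels * 25 kernel taps, summed
```

Running the same function over each of the 10 images independently gives the batch dimension, so with the sizes from the question the result would be 10*64*14*14, with 32*64*5*5 = 51,200 weights shared across all images.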