While trying to use TensorFlow I ran into a small problem regarding the stride.
I have an image of size 67*67, and I want to apply a filter of size 7*7 with stride 3. The output layer should have an edge length of 20 calculated from:
Where n is the output layer edge length (in this case, 20). It is calculated in the following way:
If we only consider the first row (since the other rows work the same way), then out of the 67 elements in the first row, the first 7 go to the first cell of the output layer. Then the filter moves 3 elements to the right, so that it covers elements 4 to 10, which corresponds to the 2nd element of the output layer. And so on. Every time we advance 3 elements, and the total number of times we advance (counting the first step, where it covers 7 elements) is n. Thus the equation I used.
However, the output layer I got from TensorFlow had an edge length of 23, which is 67/3 rounded up to the next integer. I don't understand the reasoning behind this.
Can someone explain why it is done like this in TensorFlow?
Thanks!
The output size is computed in one of two ways, depending on the padding you are using. If you are using 'SAME' padding, the output size is computed as:
out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))
Whereas with 'VALID' padding, the output size is computed as:
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
The 'VALID' formula is essentially what you were using to calculate your output, but since you got 23, you must be using 'SAME' padding.
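So in your case, with 'SAME' padding, you get:
out_height = ceil(float(67) / float(3)) = 23
out_width = ceil(float(67) / float(3)) = 23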
If you were actually using 'VALID' padding, the output would be ceil((67 - 7 + 1) / 3) = 21, which is much closer to your estimate of 20.
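As a quick sanity check, the two formulas can be written as a small plain-Python helper. This is only a sketch: the name `conv_output_size` and its arguments are made up for illustration and are not part of the TensorFlow API.

```python
import math


def conv_output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension.

    Hypothetical helper for illustration, not a TensorFlow function.
    """
    if padding == "SAME":
        # 'SAME' ignores the filter size; the input is zero-padded as needed.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # 'VALID' only counts positions where the filter fits inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError(f"unknown padding: {padding!r}")


# The numbers from the question: 67x67 image, 7x7 filter, stride 3.
same = conv_output_size(67, 7, 3, "SAME")
valid = conv_output_size(67, 7, 3, "VALID")
print(same, valid)  # 23 21
```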
You can read more about how TensorFlow calculates feature map sizes and padding here.
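To see where the extra output positions come from, it can help to count the filter positions directly along one row. The sketch below is plain Python, not TensorFlow code; `window_starts` is a hypothetical helper, and the 'SAME' padding amount is computed the way I understand the TensorFlow documentation to describe it (total padding just large enough for ceil(in / stride) windows, split as evenly as possible between the two sides).

```python
def window_starts(in_size, filter_size, stride, pad_before=0, pad_after=0):
    """Start indices (0-based, in padded coordinates) where the filter fits.

    Hypothetical helper for illustration only.
    """
    padded = in_size + pad_before + pad_after
    return list(range(0, padded - filter_size + 1, stride))


in_size, filter_size, stride = 67, 7, 3

# 'VALID': no padding, the filter must fit entirely inside the 67 elements.
valid_starts = window_starts(in_size, filter_size, stride)

# 'SAME': pad with zeros so that ceil(in_size / stride) windows fit.
out_same = -(-in_size // stride)  # integer ceiling division
pad_total = max((out_same - 1) * stride + filter_size - in_size, 0)
pad_before = pad_total // 2
pad_after = pad_total - pad_before
same_starts = window_starts(in_size, filter_size, stride, pad_before, pad_after)

print(len(valid_starts), valid_starts[-1])  # 21 windows, the last starts at index 60
print(pad_total, len(same_starts))          # 6 padded elements in total, 23 windows
```

Counting by hand without padding therefore gives 21 positions (the filter starts at elements 1, 4, ..., 61 in 1-based terms), and the two additional positions under 'SAME' exist only because of the zero padding on either side.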