Tags: input, keras, neural-network, generative-adversarial-network

Is it reasonable to change the input shape for a trained convolutional neural network?


I've seen a number of super-resolution networks that seem to imply it's fine to train a network on inputs of shape (x, y, d) and then pass images of arbitrary size to the model for prediction. In Keras, for example, such a model is specified with the placeholder shape (None, None, 3) and will accept any size.

For example, https://github.com/krasserm/super-resolution is trained on inputs of 24x24x3 but accepts arbitrarily sized images for resizing; the demo code uses 124x118x3.

Is this a sane practice? When given a larger input, does the network simply slide a window over it, applying the same weights it learnt on the smaller images?


Solution

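  • Your guess is correct. Convolutional layers learn to detect features at the scale of their kernel, not at the scale of the image as a whole. A layer with a 3x3 kernel learns to identify features up to 3x3 pixels in size, and it can find such a feature whether the image itself is 3x3, 100x100, or 1080x1920. Stacking layers enlarges the region each output value can see, but the same weights are still applied at every position. This only holds while the network is fully convolutional: a Flatten or Dense layer would tie the model to a fixed input size.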
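To make the sliding-window point concrete, here is a minimal pure-Python sketch rather than the Keras implementation. The `conv2d_valid` and `step_image` helpers and the hand-written edge-detector kernel are all hypothetical stand-ins for learned weights; the sketch assumes stride 1 and no padding.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # The same weights are applied at every position,
            # whatever the size of the image.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


def step_image(height, width, edge_col):
    """Image that is 0 left of `edge_col` and 1 from `edge_col` onwards."""
    return [[1 if c >= edge_col else 0 for c in range(width)]
            for _ in range(height)]


# Hypothetical "learned" 3x3 weights: a vertical-edge detector.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

small = conv2d_valid(step_image(5, 5, 2), kernel)
large = conv2d_valid(step_image(12, 20, 9), kernel)

# The output size follows the input size; the kernel is unchanged.
print(len(small), len(small[0]))
print(len(large), len(large[0]))

# The edge produces the same peak response in both images.
print(max(max(r) for r in small), max(max(r) for r in large))
```

The kernel has nine weights regardless of the image it is applied to, which is why a model built only from such layers can be declared with a (None, None, 3) input: only the output's spatial size changes with the input.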