preprocess_input() method in keras


I am trying out some sample Keras code from this Keras documentation page.

What does the preprocess_input(x) function from the keras module do in the code below? Why do we have to call np.expand_dims(x, axis=0) on x before passing it to preprocess_input()?

from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input
import numpy as np

model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

Is there any documentation with a good explanation of these functions?

Thanks!


Solution

  • Keras works with batches of images, so the first dimension of the input is the number of samples (images) in the batch.

    When you load a single image, you get an array with the shape of one image: (size1, size2, channels).

    To turn it into a batch, you need one extra leading dimension: (samples, size1, size2, channels). np.expand_dims(x, axis=0) adds that dimension, giving a batch that contains a single image.

    The preprocess_input function adapts your image to the format the model requires.

    Some models expect pixel values ranging from 0 to 1, others from -1 to +1. Others use the "caffe" style, which does not scale the values: it converts the image from RGB to BGR and zero-centers each color channel with respect to the ImageNet dataset.

    From the source code, ResNet50 uses the caffe style.

    You don't need to worry about the internal details of preprocess_input. Ideally, though, you should load images with the Keras image-loading functions, as in your code, so that what you load is guaranteed to be compatible with preprocess_input.
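
To make the two steps above concrete, here is a minimal NumPy-only sketch of what happens to the array. It is not the Keras implementation: caffe_style_preprocess is a hypothetical stand-in I wrote for illustration, and the constant-valued array stands in for the output of img_to_array. The per-channel means are the ImageNet values that Keras uses in "caffe" mode, listed in BGR order.

```python
import numpy as np

# ImageNet per-channel means in BGR order (the values Keras uses in "caffe" mode).
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_style_preprocess(batch):
    """Hypothetical stand-in for preprocess_input in "caffe" mode.

    Expects an array of shape (samples, height, width, 3) in RGB order
    with pixel values in the 0..255 range.
    """
    batch = batch.astype(np.float32)   # work on a float copy
    batch = batch[..., ::-1]           # reverse the channel axis: RGB -> BGR
    return batch - IMAGENET_BGR_MEAN   # zero-center each channel, no scaling


# Fake 224x224 RGB image (assumption: stands in for image.img_to_array(img)).
img = np.full((224, 224, 3), 128.0, dtype=np.float32)
print(img.shape)  # (224, 224, 3)

# Add the batch dimension in front: one sample in the batch.
x = np.expand_dims(img, axis=0)
print(x.shape)  # (1, 224, 224, 3)

# Preprocess the whole batch; the shape is unchanged, only the values move.
y = caffe_style_preprocess(x)
print(y.shape)  # (1, 224, 224, 3)
```

In real code you would keep calling the preprocess_input that belongs to the model you are using, since each model family has its own expected input format.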