I am trying out some sample Keras code from this Keras documentation page.
What does the preprocess_input(x) function of the keras module do in the code below? Why do we have to call expand_dims(x, axis=0) before x is passed to preprocess_input()?
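from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input
import numpy as np
# Load ResNet50 with weights pre-trained on ImageNet
model = ResNet50(weights='imagenet')
img_path = 'elephant.jpg'
# Load the image and resize it to the 224x224 input size used here
img = image.load_img(img_path, target_size=(224, 224))
# Convert the image to an array of shape (224, 224, 3)
x = image.img_to_array(img)
# Add a leading batch dimension: (1, 224, 224, 3)
x = np.expand_dims(x, axis=0)
# Convert the pixel values to the format the model expects
x = preprocess_input(x)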
Is there any documentation with a good explanation of these functions?
Thanks!
Keras works with batches of images. So, the first dimension is used for the number of samples (or images) you have.
When you load a single image, you get the shape of one image, which is (size1, size2, channels).
In order to create a batch of images, you need an additional dimension: (samples, size1, size2, channels).
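As a minimal sketch of the shape change described above, using only NumPy: the zero-filled array is just a stand-in for what image.img_to_array(img) would return for a 224x224 RGB image, not real image data.

```python
import numpy as np

# Stand-in for image.img_to_array(img): one 224x224 image with 3 channels.
x = np.zeros((224, 224, 3), dtype="float32")
print(x.shape)  # (224, 224, 3)

# Insert a new first axis so the array becomes a batch containing one image.
batch = np.expand_dims(x, axis=0)
print(batch.shape)  # (1, 224, 224, 3)

# Indexing with np.newaxis produces the same shape.
batch_alt = x[np.newaxis, ...]
print(batch_alt.shape)  # (1, 224, 224, 3)
```

The pixel values are unchanged; only the shape gains a leading dimension of size 1.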
The preprocess_input function is meant to adapt your image to the format the model requires.
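Some models use images with values ranging from 0 to 1. Others use values from -1 to +1. Others use the "caffe" style, which is not normalized but is centered: the channels are converted from RGB to BGR and the ImageNet mean of each channel is subtracted, without any scaling.
From the source code, ResNet50 is using the caffe style.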
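To make the caffe style concrete, here is a rough NumPy sketch of what it amounts to. It is not the Keras implementation: caffe_style_preprocess is a hypothetical helper written for illustration, and it assumes a channels-last RGB batch with values from 0 to 255. The mean values are the per-channel ImageNet means, in BGR order, that Keras uses for this mode.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (the values Keras uses for "caffe" mode).
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype="float32")

def caffe_style_preprocess(batch_rgb):
    """Hypothetical helper: approximate caffe-style preprocessing.

    Assumes a channels-last RGB batch of shape (samples, height, width, 3)
    with pixel values in the range 0 to 255.
    """
    batch = np.asarray(batch_rgb, dtype="float32")
    bgr = batch[..., ::-1]            # reverse the channel axis: RGB -> BGR
    return bgr - IMAGENET_MEAN_BGR    # zero-center each channel, no scaling

# A dummy all-white batch with one image, standing in for a real photo.
batch = np.full((1, 224, 224, 3), 255.0, dtype="float32")
out = caffe_style_preprocess(batch)
print(out.shape)  # (1, 224, 224, 3)
```

The output is no longer in the 0 to 255 range and can contain negative values, which is expected for centered data. In real code you would still call preprocess_input rather than a helper like this.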
You don't need to worry about the internal details of preprocess_input. Ideally, though, you should load images with the Keras functions meant for that, so that the images you load are guaranteed to be compatible with preprocess_input.