python tensorflow computer-vision conv-neural-network imagenet

Issue with Imagenet classification with VGG16 pretrained weights

I was trying to run plain ImageNet classification with the VGG16 network in TensorFlow (which provides VGG16 through its bundled Keras API).

However, when I run classification on a sample elephant image, it gives completely unexpected results.

I am not able to figure out what might be the issue.

Here is the complete code I used:

import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.python.keras._impl.keras.applications import imagenet_utils


model = tf.keras.applications.VGG16()
VGG = model.graph

VGG.get_operations()
input = VGG.get_tensor_by_name("input_1:0")
output = VGG.get_tensor_by_name("predictions/Softmax:0")
print(input)
print(output)

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = np.expand_dims(image_array, axis=0)


with tf.Session(graph=VGG) as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    pred = (sess.run(output,{input:image_array}))
    print(imagenet_utils.decode_predictions(pred))

Below is the sample output I get:

Tensor("input_1:0", shape=(?, 224, 224, 3), dtype=float32)
Tensor("predictions/Softmax:0", shape=(?, 1000), dtype=float32)

[[('n02281406', 'sulphur_butterfly', 0.0022673723), ('n01882714', 'koala', 0.0021256246), ('n04325704', 'stole', 0.0020583202), ('n01496331', 'electric_ray', 0.0020416214), ('n01797886', 'ruffed_grouse', 0.0020229272)]]

From the probabilities, it looks like something is wrong with the image data being passed in (all of them are very low).

But I couldn't figure out what is wrong.
And as a human, I am very sure the image is of an elephant!

Solution

It seems that we need to use the session from Keras (which holds the loaded weights) instead of creating a new session in TensorFlow on the graph obtained from the Keras model like below:

VGG = model.graph

I think the graph itself carries no weight values: in TensorFlow 1.x the variable values live in the session, not in the graph. Keras loads the pretrained weights into its own session, so a fresh tf.Session on the same graph, followed by tf.global_variables_initializer(), just gives randomly initialized weights (and that is the reason the predictions are wrong).
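A quick sanity check supports this. With 1000 classes, pure guessing gives 0.001 per class, and the top score in the question was only about 0.0023. Here is a minimal sketch of that check; the helper name and the factor of 10 are my own arbitrary choices, not anything from Keras or TensorFlow:

```python
def looks_like_chance(top_prob, num_classes=1000, factor=10.0):
    """Return True if the best score is within `factor` of uniform guessing.

    Hypothetical helper: the threshold is a rule of thumb, not a Keras API.
    """
    return top_prob < factor / num_classes


print(looks_like_chance(0.0022673723))  # top score from the question -> True
print(looks_like_chance(0.8518132))     # top score after the fix -> False
```

If the check comes back True, I would first suspect that the weights are not actually loaded in the session being used, before looking at the image itself.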

Below is the full code:

import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.python.keras._impl.keras.applications import imagenet_utils
from tensorflow.python.keras._impl.keras import backend as K


model = tf.keras.applications.VGG16()
sess = K.get_session()
VGG = model.graph  # only used to look up tensors by name; the weight values live in sess

VGG.get_operations()
input = VGG.get_tensor_by_name("input_1:0")
output = VGG.get_tensor_by_name("predictions/Softmax:0")
print(input)
print(output)

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = np.expand_dims(image_array, axis=0)
image_array = image_array.astype(np.float32)
# Apply the same preprocessing the pretrained weights were trained with
image_array = tf.keras.applications.vgg16.preprocess_input(image_array)

pred = (sess.run(output,{input:image_array}))
print(imagenet_utils.decode_predictions(pred))

And this gives the expected result:

[[('n02504458', 'African_elephant', 0.8518132), ('n01871265', 'tusker', 0.1398836), ('n02504013', 'Indian_elephant', 0.0082286), ('n01704323', 'triceratops', 6.965483e-05), ('n02397096', 'warthog', 1.8662439e-06)]]
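The other change from the question's code is the preprocess_input() call. As far as I understand it, for VGG16 it uses the "caffe" style: it flips RGB to BGR and subtracts the ImageNet per-channel means, without any scaling. Below is a rough NumPy sketch of that behaviour, only to show what happens to the pixel values. The function name is made up for illustration, and in real code the library function should be used instead:

```python
import numpy as np

# ImageNet per-channel means in BGR order, as I believe the "caffe"-style
# preprocessing for VGG16 uses them.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg16_preprocess_sketch(rgb_batch):
    """Approximate VGG16 preprocessing for an (N, H, W, 3) RGB batch."""
    x = np.asarray(rgb_batch, dtype=np.float32)  # work in float32
    x = x[..., ::-1]                             # reorder channels RGB -> BGR
    return x - IMAGENET_MEAN_BGR                 # zero-center, no scaling


# One pixel with distinct R, G, B values, shaped like a 1x1 image batch
demo = np.array([[[[10, 20, 30]]]], dtype=np.uint8)
out = vgg16_preprocess_sketch(demo)
print(out.shape, out.dtype)
print(out[0, 0, 0])  # blue first, each channel shifted by its mean
```

So the network expects roughly zero-centered BGR floats, not the raw 0-255 RGB values that come out of PIL.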

Thanks to Idavid for the tip about using the preprocess_input() function, and to Nicolas for the tip about the unloaded weights.