Tags: python, tensorflow, computer-vision, conv-neural-network, imagenet

Issue with Imagenet classification with VGG16 pretrained weights


I was trying to run a vanilla ImageNet classification with the VGG16 network in TensorFlow (which exposes VGG16 through its Keras API).

However, when I ran classification on a sample elephant image, it gave completely unexpected results.

I am not able to figure out what might be the issue.

Here is the complete code I used:

import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.python.keras._impl.keras.applications import imagenet_utils


model = tf.keras.applications.VGG16()
VGG = model.graph

VGG.get_operations()
input = VGG.get_tensor_by_name("input_1:0")
output = VGG.get_tensor_by_name("predictions/Softmax:0")
print(input)
print(output)

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = np.expand_dims(image_array, axis=0)


with tf.Session(graph=VGG) as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    pred = (sess.run(output,{input:image_array}))
    print(imagenet_utils.decode_predictions(pred))

Below is the sample output I get:

Tensor("input_1:0", shape=(?, 224, 224, 3), dtype=float32)
Tensor("predictions/Softmax:0", shape=(?, 1000), dtype=float32)

[[('n02281406', 'sulphur_butterfly', 0.0022673723), ('n01882714', 'koala', 0.0021256246), ('n04325704', 'stole', 0.0020583202), ('n01496331', 'electric_ray', 0.0020416214), ('n01797886', 'ruffed_grouse', 0.0020229272)]]

From the probabilities, it looks like something is wrong with the image data being passed in (all of them are very low).

But I couldn't figure out what is wrong.
To a human eye, the image is very clearly an elephant!
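
To put "very low" in numbers, here is a small sanity check, assuming only the 1000-class softmax and the top-5 scores printed above. A model that guessed uniformly would give every class 0.001, so the best score here is only about twice chance level:

```python
# Top-5 scores copied from the sample output above.
top5 = [0.0022673723, 0.0021256246, 0.0020583202, 0.0020416214, 0.0020229272]

num_classes = 1000            # size of the VGG16 ImageNet softmax
uniform = 1.0 / num_classes   # score each class would get from a pure guess

# How far above chance the most confident prediction is.
ratio = top5[0] / uniform

print(f"uniform: {uniform:.4f}, best score: {top5[0]:.4f}, ratio: {ratio:.2f}x")
```

Scores this close to uniform suggest the network is behaving as if it were untrained, rather than being confidently wrong about this one image.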


Solution

  • It seems that we can (or need to?) use the session from Keras (which holds the loaded graph together with its weights) instead of creating a new TensorFlow session on the graph obtained from the Keras model, as below:

    VGG = model.graph  
    

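    I think the graph obtained this way carries no weight values on its own: in TensorFlow 1.x the variable values live in the session, and running tf.global_variables_initializer() in a fresh session sets them to random initial values. The Keras session, on the other hand, holds the pretrained weights, and that is the reason the predictions were wrong.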

    Below is the full code, which also converts the image to float32 and passes it through preprocess_input():

    import tensorflow as tf
    import numpy as np
    from PIL import Image
    from tensorflow.python.keras._impl.keras.applications import imagenet_utils
    from tensorflow.python.keras._impl.keras import backend as K
    
    
    model = tf.keras.applications.VGG16()
    sess = K.get_session()  # Keras session that holds the pretrained weights
    VGG = model.graph  # only used to look up tensors by name; it holds no weight values
    
    VGG.get_operations()
    input = VGG.get_tensor_by_name("input_1:0")
    output = VGG.get_tensor_by_name("predictions/Softmax:0")
    print(input)
    print(output)
    
    I = Image.open("Elephant.jpg")
    new_img = I.resize((224,224))
    image_array = np.array(new_img)[:, :, 0:3]
    image_array = np.expand_dims(image_array, axis=0)
    # VGG16 expects float input preprocessed the same way as during training
    image_array = image_array.astype(np.float32)
    image_array = tf.keras.applications.vgg16.preprocess_input(image_array)
    
    pred = (sess.run(output,{input:image_array}))
    print(imagenet_utils.decode_predictions(pred))
    

    And this gives the expected result:

    [[('n02504458', 'African_elephant', 0.8518132), ('n01871265', 'tusker', 0.1398836), ('n02504013', 'Indian_elephant', 0.0082286), ('n01704323', 'triceratops', 6.965483e-05), ('n02397096', 'warthog', 1.8662439e-06)]]

    Thanks to Idavid for the tip about using the preprocess_input() function, and to Nicolas for the tips about the unloaded weights.
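
For reference, here is a minimal NumPy sketch of roughly what preprocess_input() does for VGG16, assuming its default "caffe"-style mode: the channels are flipped from RGB to BGR and the ImageNet per-channel means are subtracted, with no scaling. The function name vgg16_preprocess_sketch and the all-gray test image are made up for illustration; real code should keep calling tf.keras.applications.vgg16.preprocess_input.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by "caffe"-style preprocessing.
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg16_preprocess_sketch(rgb_batch):
    """Illustrative stand-in for preprocess_input on an (N, H, W, 3) RGB batch."""
    x = np.asarray(rgb_batch, dtype=np.float32)
    x = x[..., ::-1]        # reorder channels: RGB -> BGR
    return x - BGR_MEANS    # zero-center each channel; values are not rescaled


# Hypothetical all-gray image standing in for the resized elephant photo.
fake_image = np.full((1, 224, 224, 3), 128, dtype=np.uint8)
processed = vgg16_preprocess_sketch(fake_image)

print(processed.shape, processed.dtype)
print(processed[0, 0, 0])
```

Feeding raw 0-255 RGB values skips both steps, so the network sees inputs it was never trained on. In the code above that was a second problem on top of the randomly initialized weights.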