I was playing around with Tensorflow for image classification. I used the image_retraining/retrain.py to retrain the inception library with new categories and used it to classify images using label_image.py from https://github.com/llSourcell/tensorflow_image_classifier/blob/master/src/label_image.py as below:
import tensorflow as tf
import sys
# change this as you see fit
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile("/root/tf_files/output_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
#predictions = sess.run(softmax_tensor,{'DecodeJpeg/contents:0': image_data})
predictions = sess.run(softmax_tensor,{'DecodePng/contents:0': image_data})
# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
for node_id in top_k:
human_string = label_lines[node_id]
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
I noticed two issues. When I retrain with new categories, it only trains JPG images. I am a noob in machine learning so not sure whether this is a limitation or is it possible to train other extension images like PNG, GIF?
Another one is when classifying the images the input is again only for JPG. I tried to change DecodeJpeg to DecodePng in label_image.py above but couldn't work. Another way I tried was to convert other formats into JPG before passing them in for classification like:
im = Image.open('/root/Desktop/200_s.gif').convert('RGB')
im.save('/root/Desktop/test.jpg', "JPEG")
image_path1 = '/root/Desktop/test.jpg'
Is there any other way to do this? Does Tensorflow have functions to handle other image formats other than JPG?
I tried the following by feeding in parsed image as compared to JPEG as suggested by @mrry
import tensorflow as tf
import sys
import numpy as np
from PIL import Image
# change this as you see fit
image_path = sys.argv[1]
# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()
image = Image.open(image_path)
image_array = np.array(image)[:,:,0:3] # Select RGB channels only.
# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
in tf.gfile.GFile("/root/tf_files/output_labels.txt")]
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
_ = tf.import_graph_def(graph_def, name='')
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
predictions = sess.run(softmax_tensor,{'DecodeJpeg:0': image_array})
# Sort to show labels of first prediction in order of confidence
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
for node_id in top_k:
human_string = label_lines[node_id]
score = predictions[0][node_id]
print('%s (score = %.5f)' % (human_string, score))
It works for JPEG images but when I use PNG or GIF it throws
Traceback (most recent call last):
File "label_image.py", line 17, in <module>
image_array = np.array(image)[:,:,0:3] # Select RGB channels only.
IndexError: too many indices for array
The model can only train on (and evaluate) JPEG images, because the GraphDef
that you've saved in /root/tf_files/output_graph.pb
only contains a tf.image.decode_jpeg()
op, and uses the output of that op for making predictions. There are at least a couple of options for using other image formats:
Feed in parsed images rather than JPEG data. In the current program, you feed in a JPEG-encoded image as a string value for the tensor "DecodeJpeg/contents:0"
. Instead, you can feed in a 3-D array of decoded image data for the tensor "DecodeJpeg:0"
(which represents the output of the tf.image.decode_jpeg()
op), and you can use NumPy, PIL, or some other Python library to create this array.
Remap the image input in tf.import_graph_def()
. The tf.import_graph_def()
function enables you to connect two different graphs together by remapping individual tensor values. For example, you could do something like the following to add a new image-processing op to the existing graph:
image_string_input = tf.placeholder(tf.string)
image_decoded = tf.image.decode_png(image_string_input)
# Unpersists graph from file
with tf.gfile.FastGFile("/root/tf_files/output_graph.pb", 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
softmax_tensor, = tf.import_graph_def(
graph_def,
input_map={"DecodeJpeg:0": image_decoded},
return_operations=["final_result:0"])
with tf.Session() as sess:
# Feed the image_data as input to the graph and get first prediction
predictions = sess.run(softmax_tensor, {image_string_input: image_data})
# ...