I have a Raspberry Pi 4 and want to do object detection at a good frame rate. I tried TensorFlow and YOLO, but both run at 1 fps, so I am trying TensorFlow Lite. I have downloaded the tflite file and the labelmap.txt file, and I used this link to try to run inference. This is where I ran into a problem: I do not understand how to get the results (class, bounding-box coordinates, and confidence) out of the output.
Here is my code:
import tensorflow as tf
import numpy as np
import cv2
interpreter = tf.lite.Interpreter(model_path="/content/drive/My Drive/detect.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
print(input_details)
print(output_details)
print()
input_shape = input_details[0]['shape']
im = cv2.imread("/content/drive/My Drive/doggy.jpg")
im_rgb = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
# cv2.resize expects (width, height); the input shape is [batch, height, width, channels]
im_rgb = cv2.resize(im_rgb, (input_shape[2], input_shape[1]))
input_data = np.expand_dims(im_rgb, axis=0)
print(input_data.shape)
print()
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data.shape)
print()
print(output_data)
And here is my output:
[{'name': 'normalized_input_image_tensor', 'index': 175, 'shape': array([ 1, 300, 300, 3], dtype=int32), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.0078125, 128)}]
[{'name': 'TFLite_Detection_PostProcess', 'index': 167, 'shape': array([ 1, 10, 4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:1', 'index': 168, 'shape': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:2', 'index': 169, 'shape': array([ 1, 10], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}, {'name': 'TFLite_Detection_PostProcess:3', 'index': 170, 'shape': array([1], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0)}]
(1, 300, 300, 3)
(1, 10, 4)
[[[ 1.66415479e-02 5.48024022e-04 8.67791831e-01 3.35325867e-01]
[ 7.41335377e-02 3.22245747e-01 9.64617252e-01 9.71388936e-01]
[-2.11861148e-03 5.41743517e-01 2.60241032e-01 7.02846169e-01]
[-5.67546487e-03 3.26282382e-01 8.59034657e-01 6.30770981e-01]
[ 7.27111334e-03 7.90268779e-01 2.86753297e-01 9.56545353e-01]
[ 2.07318692e-03 7.96441555e-01 5.48386931e-01 9.96111989e-01]
[-1.04907183e-02 2.38761827e-01 6.75976276e-01 7.01156497e-01]
[ 3.12007014e-02 1.34294275e-02 5.82291842e-01 3.10949832e-01]
[-1.95578858e-03 7.05318868e-01 9.18281525e-02 7.96184599e-01]
[-5.43205580e-03 3.23292404e-01 6.34427786e-01 5.68508685e-01]]]
The output (the last list) seems to be an array of very small numbers. How do I get the result out of this?
Thanks
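I solved this issue with the help of @daverim on GitHub, where I had opened an issue: https://github.com/tensorflow/tensorflow/issues/34761. The model has four output tensors (boxes, classes, scores, and the number of detections), and my code only read the first one. Here is the code to get useful data: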
detection_boxes = interpreter.get_tensor(output_details[0]['index'])
detection_classes = interpreter.get_tensor(output_details[1]['index'])
detection_scores = interpreter.get_tensor(output_details[2]['index'])
num_boxes = interpreter.get_tensor(output_details[3]['index'])
print(num_boxes)
for i in range(int(num_boxes[0])):
    if detection_scores[0, i] > .5:
        class_id = detection_classes[0, i]
        print(class_id)
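The "very small numbers" in the boxes tensor are normalized coordinates. As far as I know, this post-processing op returns each box as [ymin, xmin, ymax, xmax] in the range 0 to 1, so they have to be scaled by the image size before drawing. Here is a minimal sketch of that step; the helper name boxes_to_pixels is my own, and the 300x300 size is only an example (use the size of the image you draw on):

```python
import numpy as np

def boxes_to_pixels(boxes, image_height, image_width):
    """Scale normalized [ymin, xmin, ymax, xmax] boxes to pixel
    [left, top, right, bottom] boxes (hypothetical helper)."""
    # Values can fall slightly outside [0, 1], so clip them first.
    boxes = np.clip(np.asarray(boxes, dtype=np.float32), 0.0, 1.0)
    scale = np.array([image_height, image_width, image_height, image_width],
                     dtype=np.float32)
    ymin, xmin, ymax, xmax = (boxes * scale).round().astype(int).T
    return np.stack([xmin, ymin, xmax, ymax], axis=1)

# Two of the boxes printed in the question, i.e. rows of detection_boxes[0].
example = np.array([[0.0741335377, 0.322245747, 0.964617252, 0.971388936],
                    [-0.00211861148, 0.541743517, 0.260241032, 0.702846169]])
pixel_boxes = boxes_to_pixels(example, image_height=300, image_width=300)
print(pixel_boxes)
```

Each row can then be passed to cv2.rectangle as the (left, top) and (right, bottom) corners.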
Using the labelmap.txt file we can get the class name.
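A rough sketch of the lookup, assuming labelmap.txt has one class name per line. The helper names load_labels and class_name and the offset parameter are my own. Some label maps start with a placeholder line such as "???", in which case the class id has to be shifted by one, so check your file before picking the offset. The temporary file below only stands in for the real labelmap.txt:

```python
import os
import tempfile

def load_labels(path):
    """Read one label per line (assumed file format)."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f]

def class_name(labels, class_id, offset=0):
    """Map the float class id returned by the model to a label string."""
    index = int(class_id) + offset
    if 0 <= index < len(labels):
        return labels[index]
    return "unknown({})".format(index)

# Stand-in for labelmap.txt with a placeholder first line (made-up content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("???\nperson\nbicycle\n")
labels = load_labels(tmp.name)
os.remove(tmp.name)

print(class_name(labels, 0.0, offset=1))
```

In the loop above, class_name(labels, detection_classes[0, i], offset=...) would replace print(class_id).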