I am trying to convert a two-output Keras model to a compiled, quantized TFLite model that will run on a Google Coral. I have used this exact process before with a Keras network that has only one output, and it worked.
Here is my process:
import os
import pathlib

import numpy as np
import tensorflow as tf
from PIL import Image
from tensorflow.keras.applications.mobilenet import preprocess_input
file = 'path/to/model-01.h5'
model = tf.keras.models.load_model(file)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
os.chdir('/path/to/image/directories')  # Where image directories are
directory = os.listdir()
directory
def representative_dataset_gen():
    for i in directory:
        count = 0
        os.chdir(i)
        files = os.listdir()
        print(i)
        for j in files:
            # Use at most 500 images per directory for calibration
            if count < 500:
                img = Image.open(j)
                width, height = img.size
                bands = img.getbands()
                array = np.asarray(img, dtype=np.float32)
                array = preprocess_input(array)
                count = count + 1
                yield [np.expand_dims(array, axis=0)]
            else:
                break
        os.chdir('../')
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 # or tf.uint8
converter.inference_output_type = tf.int8 # or tf.uint8
tflite_quant_model = converter.convert()
tflite_model_dir = pathlib.Path('where/i/want/to/save/')
tflite_quant_model_file = tflite_model_dir/'quantized.tflite'
tflite_quant_model_file.write_bytes(tflite_quant_model)
Then I run edgetpu_compiler in the terminal:
edgetpu_compiler quantized.tflite
And receive this error:
ERROR: :129 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale) was not true.
ERROR: Node number 40 (FULLY_CONNECTED) failed to prepare.
Internal compiler error. Aborting!
I get the same error from interpreter.allocate_tensors() when I try to validate the model:
#Load Model
interpreter = tf.lite.Interpreter(model_path='path/to/model/quantized.tflite')
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.resize_tensor_input(input_details[0]['index'], (32, 200, 200, 3))
interpreter.resize_tensor_input(output_details[0]['index'], (32, 5))
interpreter.allocate_tensors()
It returns
RuntimeError Traceback (most recent call last)
in
2 interpreter.resize_tensor_input(input_details[0]['index'], (32, 200, 200, 3))
3 interpreter.resize_tensor_input(output_details[0]['index'], (32, 5))
----> 4 interpreter.allocate_tensors()
5
~/Software/anaconda3/envs/Tensorflow2/lib/python3.7/site-packages/tensorflow_core/lite/python/interpreter.py in allocate_tensors(self)
245 def allocate_tensors(self):
246 self._ensure_safe()
--> 247 return self._interpreter.AllocateTensors()
248
249 def _safe_to_run(self):
~/Software/anaconda3/envs/Tensorflow2/lib/python3.7/site-packages/tensorflow_core/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py in AllocateTensors(self)
108
109 def AllocateTensors(self):
--> 110 return _tensorflow_wrap_interpreter_wrapper.InterpreterWrapper_AllocateTensors(self)
111
112 def Invoke(self):
RuntimeError: tensorflow/lite/kernels/kernel_util.cc:106 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale) was not true.Node number 40 (FULLY_CONNECTED) failed to prepare.
I am using TensorFlow 2.2.0.
I would open an issue here for this one, since it is an actual bug in tflite quantization. I'm pretty sure I've seen this before, but I'm not sure whether there has been a fix for it :/
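For context, the condition in the error message compares the bias scale of the FULLY_CONNECTED node with the product of its input and weight scales. Here is a minimal sketch of that check in plain Python, assuming input_product_scale is the input scale multiplied by the filter scale (that is how I read the message). The function name and the sample scales are hypothetical and only there to illustrate the arithmetic.

```python
def bias_scale_matches(input_scale, filter_scale, bias_scale, rel_tol=1e-6):
    """Mirror the condition printed in the error message.

    Assumption: input_product_scale = input_scale * filter_scale.
    """
    input_product_scale = input_scale * filter_scale
    difference = abs(input_product_scale - bias_scale)
    return difference <= rel_tol * min(input_product_scale, bias_scale)


# Hypothetical scales, chosen only to illustrate the check.
print(bias_scale_matches(0.02, 0.005, 0.0001))   # bias scale equals the product
print(bias_scale_matches(0.02, 0.005, 0.00012))  # bias scale is 20% off
```

If the real scales of node 40 fail this check, the converter wrote inconsistent quantization parameters into the file, which would point at the conversion step rather than at your own code.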
[edit] You can use the script below for a dummy inference run. If it fails on your CPU model, then the model was clearly broken by the tflite conversion.
import numpy as np
import sys
from tflite_runtime.interpreter import Interpreter
from tflite_runtime.interpreter import load_delegate

if len(sys.argv) < 2:
    print('Usage:', sys.argv[0], 'model_path')
    exit()

def main():
    """Runs inference with an input tflite model."""
    model_path = str(sys.argv[1])
    if model_path.endswith('edgetpu.tflite'):
        print('initialized for edgetpu')
        delegates = [load_delegate('libedgetpu.so.1.0')]
        interpreter = Interpreter(model_path, experimental_delegates=delegates)
    else:
        print('initialized for cpu')
        interpreter = Interpreter(model_path)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    # Feed an all-zero tensor that matches the model's input shape and dtype
    images = np.zeros(input_details[0]['shape'], input_details[0]['dtype'])
    #print(images)
    interpreter.set_tensor(input_details[0]['index'], images)
    interpreter.invoke()
    output_details = interpreter.get_output_details()
    outputs = interpreter.get_tensor(output_details[0]['index'])
    print(outputs)
    print('Success.')

if __name__ == '__main__':
    main()
I've seen this problem in the past, but I'm not sure whether it has been resolved. Opening a bug is actually the best way to get this fixed.
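When you open the bug, it may help to attach the quantization parameters of the tensors in the converted model. The sketch below is a rough, hypothetical helper (quantized_tensors and dump are names I made up), built on two assumptions: that Interpreter.get_tensor_details() can be called before allocate_tensors(), and that a scale of 0.0 means the tensor carries no quantization parameters.

```python
def quantized_tensors(tensor_details):
    """Return (index, name, dtype, scale, zero_point) for each quantized tensor.

    tensor_details is the list of dicts from Interpreter.get_tensor_details().
    Assumption: a scale of 0.0 marks a tensor without quantization parameters.
    """
    rows = []
    for detail in tensor_details:
        scale, zero_point = detail['quantization']
        if scale == 0:
            continue
        dtype = detail['dtype']
        dtype_name = getattr(dtype, '__name__', str(dtype))
        rows.append((detail['index'], detail['name'], dtype_name,
                     float(scale), int(zero_point)))
    return rows


def dump(model_path):
    """Print the quantized tensors of a .tflite file (needs tflite_runtime)."""
    from tflite_runtime.interpreter import Interpreter
    interpreter = Interpreter(model_path)
    for row in quantized_tensors(interpreter.get_tensor_details()):
        print(row)
```

Calling dump('path/to/model/quantized.tflite') should list the int8 and int32 tensors with their scales, so the bias scale of the failing FULLY_CONNECTED node can be compared against its input and weight scales.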