python tensorflow keras tensorflow-lite google-coral

Error compiling with Edgetpu compiler for Tensorflow

I am trying to convert a two-output Keras model into a quantized TFLite model and compile it so that it runs on a Google Coral. I have used this exact process before with a Keras network that has only one output, and it worked.

Here is my process:

import os
import pathlib

import numpy as np
import tensorflow as tf
from PIL import Image
from tensorflow.keras.applications.mobilenet import preprocess_input

file = 'path/to/model-01.h5'
model = tf.keras.models.load_model(file)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

os.chdir('/path/to/image/directories')  # where the image directories are
directory = os.listdir()
directory

def representative_dataset_gen():
    # Yield up to 500 preprocessed images from each image directory.
    for i in directory:
        count = 0
        os.chdir(i)
        files = os.listdir()
        print(i)
        for j in files:
            if count < 500:
                img = Image.open(j)
                array = np.asarray(img, dtype=np.float32)
                array = preprocess_input(array)
                count += 1
                # One list per sample, holding a single array with a batch axis.
                yield [np.expand_dims(array, axis=0)]
            else:
                break
        os.chdir('../')

converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()

tflite_model_dir = pathlib.Path('where/i/want/to/save/')
tflite_quant_model_file = tflite_model_dir/'quantized.tflite'
tflite_quant_model_file.write_bytes(tflite_quant_model)
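
The generator above depends on the current working directory, because it calls os.chdir() while iterating. As a hedged alternative, here is a minimal sketch that collects the same kind of sample (up to 500 files per subdirectory) without changing directories. The function name sample_image_paths and its parameters are hypothetical, and it sorts the entries, whereas os.listdir() returns them in arbitrary order.

```python
import pathlib


def sample_image_paths(root, per_directory=500):
    """Yield up to per_directory file paths from each subdirectory of root."""
    for class_dir in sorted(pathlib.Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        # Sort so that the selected sample is the same on every run.
        files = sorted(p for p in class_dir.iterdir() if p.is_file())
        for path in files[:per_directory]:
            yield path
```

Each yielded path can be passed to Image.open() in place of the bare file name, so the generator no longer needs os.chdir().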

Then I run edgetpu_compiler in the terminal:

edgetpu_compiler quantizedmodel.tflite

And receive this error:

ERROR: :129 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale) was not true.
ERROR: Node number 40 (FULLY_CONNECTED) failed to prepare.


Internal compiler error. Aborting!
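
For readers unfamiliar with this message: the failing condition compares two quantization scales and requires them to agree to within a relative tolerance of 1e-6. Below is a minimal sketch of the same inequality in plain Python. It assumes that input_product_scale is the layer's input scale multiplied by its weight scale, which is a reading of the variable name rather than something stated in the error. The function name and the example values are hypothetical.

```python
def bias_scale_matches(input_scale, weight_scale, bias_scale, rel_tol=1e-6):
    """Mirror the inequality from the error message in plain Python."""
    # Assumption: input_product_scale is the input scale times the weight scale.
    input_product_scale = input_scale * weight_scale
    difference = abs(input_product_scale - bias_scale)
    return difference <= rel_tol * min(input_product_scale, bias_scale)


# Hypothetical scales, chosen only to illustrate both outcomes.
print(bias_scale_matches(0.5, 0.25, 0.125))  # True: scales are consistent
print(bias_scale_matches(0.5, 0.25, 0.126))  # False: bias scale is off by about 1%
```

Under that assumption, the error means that the bias scale stored for node 40 does not match the scales of that layer's input and weights.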

I also get the same error from interpreter.allocate_tensors() when I try to validate the model:

#Load Model
interpreter = tf.lite.Interpreter(model_path='path/to/model/quantized.tflite')
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.resize_tensor_input(input_details[0]['index'], (32, 200, 200, 3))
interpreter.resize_tensor_input(output_details[0]['index'], (32, 5))
interpreter.allocate_tensors()

It raises:

RuntimeError                              Traceback (most recent call last)
 in 
      2 interpreter.resize_tensor_input(input_details[0]['index'], (32, 200, 200, 3))
      3 interpreter.resize_tensor_input(output_details[0]['index'], (32, 5))
----> 4 interpreter.allocate_tensors()
      5 

~/Software/anaconda3/envs/Tensorflow2/lib/python3.7/site-packages/tensorflow_core/lite/python/interpreter.py in allocate_tensors(self)
    245   def allocate_tensors(self):
    246     self._ensure_safe()
--> 247     return self._interpreter.AllocateTensors()
    248 
    249   def _safe_to_run(self):

~/Software/anaconda3/envs/Tensorflow2/lib/python3.7/site-packages/tensorflow_core/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py in AllocateTensors(self)
    108 
    109     def AllocateTensors(self):
--> 110         return _tensorflow_wrap_interpreter_wrapper.InterpreterWrapper_AllocateTensors(self)
    111 
    112     def Invoke(self):

RuntimeError: tensorflow/lite/kernels/kernel_util.cc:106 std::abs(input_product_scale - bias_scale) <= 1e-6 * std::min(input_product_scale, bias_scale) was not true.Node number 40 (FULLY_CONNECTED) failed to prepare.

I am using TensorFlow 2.2.0.

Solution

I would open an issue here for this one, since it looks like an actual bug in TFLite quantization. I'm quite sure I've seen this before, but I don't know whether it has been fixed :/

[edit] Basically, you can use this script for a dummy inference run. If it fails on your CPU model, then the model was clearly broken by the TFLite conversion.

import sys

import numpy as np
from tflite_runtime.interpreter import Interpreter
from tflite_runtime.interpreter import load_delegate


def main():
    """Runs inference with an input tflite model."""
    if len(sys.argv) < 2:
        print('Usage:', sys.argv[0], 'model_path')
        sys.exit(1)

    model_path = str(sys.argv[1])
    if model_path.endswith('edgetpu.tflite'):
        # Compiled models are run through the Edge TPU delegate.
        print('initialized for edgetpu')
        delegates = [load_delegate('libedgetpu.so.1.0')]
        interpreter = Interpreter(model_path, experimental_delegates=delegates)
    else:
        print('initialized for cpu')
        interpreter = Interpreter(model_path)

    # In the question, this is the call that raises the error.
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    # Feed an all-zero input with the shape and dtype the model expects.
    images = np.zeros(input_details[0]['shape'], input_details[0]['dtype'])
    interpreter.set_tensor(input_details[0]['index'], images)
    interpreter.invoke()
    output_details = interpreter.get_output_details()
    outputs = interpreter.get_tensor(output_details[0]['index'])
    print(outputs)
    print('Success.')


if __name__ == '__main__':
    main()
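
If the dummy run fails, it can help to attach the quantization parameters of the model's tensors to the bug report. Here is a minimal sketch, assuming each entry has the 'index', 'name' and 'quantization' keys that interpreter.get_tensor_details() returns, and assuming that a scale of 0.0 marks a tensor without quantization parameters. The helper name and the example entries are hypothetical.

```python
def quantized_tensor_summary(tensor_details):
    """Return (index, name, scale, zero_point) for tensors that carry a scale."""
    rows = []
    for detail in tensor_details:
        scale, zero_point = detail['quantization']
        # Assumption: a scale of 0.0 marks a tensor without quantization parameters.
        if scale != 0.0:
            rows.append((detail['index'], detail['name'], scale, zero_point))
    return rows


# Hypothetical entries shaped like the dicts from interpreter.get_tensor_details().
example_details = [
    {'index': 0, 'name': 'input', 'quantization': (0.5, -128)},
    {'index': 1, 'name': 'dense/bias', 'quantization': (0.125, 0)},
    {'index': 2, 'name': 'shape_const', 'quantization': (0.0, 0)},
]
for row in quantized_tensor_summary(example_details):
    print(row)
```

With a real model you would pass interpreter.get_tensor_details() instead of example_details. The question calls get_input_details() before allocate_tensors(), so reading these details should not require the allocation step that fails.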

I've seen this problem in the past, but I'm not sure whether it has been resolved. Opening a bug is the best way to get it fixed.