Google Coral Edge TPU compiled model - inference output is almost always the same

I'm trying to get a MobileNetV2 model (with the last layers retrained on my dataset) to run on the Google Coral Edge TPU. I'm able to quantize and compile the model with the 'edgetpu_compiler' (following this page). But when I run inference on the TPU, I get similar outputs for very different input images.

I used the 'tflite_convert' tool to quantize the model like this:

tflite_convert --output_file=./model.tflite \
  --keras_model_file=models/MobileNet2_best-val-acc.h5 --output_format=TFLITE \
  --inference_type=QUANTIZED_UINT8 --default_ranges_min=0 --default_ranges_max=6 \
  --std_dev_values=127 --mean_values=128 --input_shapes=1,482,640,3 --input_arrays=input_2

Then I used the 'edgetpu_compiler' tool to compile it for the TPU:

sudo edgetpu_compiler model.tflite
Edge TPU Compiler version 2.0.258810407
INFO: Initialized TensorFlow Lite runtime.

Model compiled successfully in 557 ms.

Input model: model.tflite
Input size: 3.44MiB
Output model: model_edgetpu.tflite
Output size: 4.16MiB
On-chip memory available for caching model parameters: 4.25MiB
On-chip memory used for caching model parameters: 3.81MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 71
Operation log: model_edgetpu.log
See the operation log file for individual operation details.

Then when I run inference using this code:

labels = ["Class1", "Class2", "Class3", "Class4"]
# Each result is a (label_id, score) pair
results = engine.ClassifyWithImage(img, top_k=4)
for result in results:
    print('Score : ', result[1])

The output is like this (assuming labels ["Class1", "Class2", "Class3", "Class4"]):

Score :  0.2890625
Score :  0.26953125
Score :  0.21875
Score :  0.21875

It is almost the same for any input image, and usually the first two classes have the same (or very similar) value, as do the 3rd and 4th, as in the example above. It should be 0.99 for one class, as it is with the .h5 model and even with the .tflite model without quantization.

Could the problem be the parameters --default_ranges_min=0 --default_ranges_max=6 --std_dev_values=127 --mean_values=128? How can I calculate them?

Edit 1:

Using the answer from this post, I've tried quantizing the model with both --std_dev_values=127 --mean_values=128 and --std_dev_values=255 --mean_values=0, but I'm still getting garbage inference. Since MobileNetV2 uses relu6, the default ranges should be --default_ranges_min=0 --default_ranges_max=6, right?

The model is a MobileNetv2 retrained, the input is an RGB image (3 channels), the input shape is 1,482,640,3.
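
For reference, here is a minimal sketch of how --mean_values and --std_dev_values relate to the float input range. It assumes the mapping documented for tflite_convert, real_value = (quantized_value - mean_value) / std_dev_value. The helper names are hypothetical, and the right range is whatever preprocessing the model was trained with; the values alone do not explain the flat scores.

```python
def mean_std_for_range(real_min, real_max, q_min=0, q_max=255):
    """Return (mean_value, std_dev_value) so that q_min..q_max maps to real_min..real_max.

    Assumes real_value = (quantized_value - mean_value) / std_dev_value.
    """
    std_dev = (q_max - q_min) / (real_max - real_min)
    mean = q_min - real_min * std_dev
    return mean, std_dev


def dequantize(q, mean, std_dev):
    """Float value the model sees for a uint8 input value q."""
    return (q - mean) / std_dev


# Training inputs scaled to [-1, 1] (typical for MobileNet preprocessing)
print(mean_std_for_range(-1.0, 1.0))
# Training inputs scaled to [0, 1]
print(mean_std_for_range(0.0, 1.0))
# Range actually covered by --mean_values=128 --std_dev_values=127
print(dequantize(0, 128, 127), dequantize(255, 128, 127))
```

Under that assumption, 128/127 corresponds roughly to a [-1, 1] input range and 0/255 to [0, 1].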


  • From your comment on mobilenetv1, it sounds like you are taking a retrained float model and converting it to TFLite. You intended to quantize it by running the command that you listed.

    I'd recommend that you take a closer look at the TensorFlow Lite docs. In general, there are two ways to quantize a model: during training, or post-training. The approach you seem to want to take is post-training.

    The proper way of doing it post-training for something like Coral is to follow this guide, as recommended by the Coral team here.

    The flow you're using above is more geared towards training-time quantization.
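
As a rough illustration of the post-training route, the sketch below uses the TensorFlow 2.x Python converter with a representative dataset. It is not taken from the guide itself. The function names and the `calibration_images` argument are hypothetical: it is assumed to be an iterable of float32 images of shape (482, 640, 3), preprocessed exactly as during training. Setting the uint8 input/output types also assumes a reasonably recent TensorFlow 2.x release.

```python
def representative_dataset(calibration_images):
    """Wrap preprocessed images as the generator the converter expects."""
    def generator():
        import numpy as np  # imported lazily so the sketch loads without NumPy

        for image in calibration_images:
            # One sample per step, shaped (1, height, width, 3), float32
            yield [np.expand_dims(np.asarray(image, dtype=np.float32), axis=0)]

    return generator


def quantize_for_edgetpu(keras_path, calibration_images, output_path):
    """Full-integer post-training quantization of a Keras .h5 model."""
    import tensorflow as tf  # assumption: TensorFlow 2.x

    model = tf.keras.models.load_model(keras_path)
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Activation ranges are calibrated from real data instead of default ranges
    converter.representative_dataset = representative_dataset(calibration_images)
    # Require integer-only ops so the Edge TPU compiler can map the whole graph
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8

    with open(output_path, "wb") as f:
        f.write(converter.convert())
```

It would be called as, for example, quantize_for_edgetpu("models/MobileNet2_best-val-acc.h5", calibration_images, "model.tflite"), and the resulting file then passed to edgetpu_compiler as before.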