The following command creates an 8-bit quantized TF Lite model, and replacing QUANTIZED_UINT8 with FLOAT creates a 32-bit float model. Is there a flag that creates a 16-bit quantized model? I've searched the TF Lite documentation but couldn't find a list of the possible flags. Does anyone know how to do this?
~/tensorflow/bazel-bin/tensorflow/contrib/lite/toco/toco \
--input_file=$(pwd)/model.pb \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--output_file=$(pwd)/model.lite --inference_type=QUANTIZED_UINT8 \
--input_type=QUANTIZED_UINT8 --input_arrays=conv2d_1_input \
--default_ranges_min=0.0 --default_ranges_max=1.0 \
--output_arrays=average_pooling2d_2/AvgPool --input_shapes=1024,32,32,2
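For reference, here is the float variant mentioned above. This is my own sketch of the substitution, not an official recipe; I've dropped the default_ranges flags since, as far as I understand, they only matter for quantized inference:

# 32-bit float variant of the same conversion (sketch; default_ranges
# flags omitted because they only apply to quantized inference).
~/tensorflow/bazel-bin/tensorflow/contrib/lite/toco/toco \
  --input_file=$(pwd)/model.pb \
  --input_format=TENSORFLOW_GRAPHDEF \
  --output_format=TFLITE \
  --output_file=$(pwd)/model.lite \
  --inference_type=FLOAT \
  --input_type=FLOAT \
  --input_arrays=conv2d_1_input \
  --output_arrays=average_pooling2d_2/AvgPool \
  --input_shapes=1024,32,32,2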
Currently, the only quantized type that TFLite supports is 8 bits. See the list of supported types here: https://github.com/tensorflow/tensorflow/blob/54b62eed204fbc4e155fbf934bee9b438bb391ef/tensorflow/lite/toco/types.proto#L27
This is because 8 bits has so far proven sufficient for existing quantized models, but that may change. If you have a model that genuinely needs more bits for quantization, it may be worthwhile to create a TensorFlow issue describing your use case.
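If you want to check exactly which values your own build accepts for --inference_type and --input_type rather than relying on documentation, you can grep the IODataType enum in types.proto in your checkout. This is just a quick local check I use; adjust the path to your tree, since older (contrib-era) checkouts keep toco under tensorflow/contrib/lite/toco:

# List the I/O data types this version of toco defines (path assumes a
# post-contrib checkout; see the commented line for older trees).
grep -nE 'QUANTIZED|FLOAT' ~/tensorflow/tensorflow/lite/toco/types.proto
# grep -nE 'QUANTIZED|FLOAT' ~/tensorflow/tensorflow/contrib/lite/toco/types.proto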