I followed a great tutorial on deploying a TensorFlow model using TF-Lite and everything works. However, when I try to use my own model (converted from a saved keras model) I get the following error when calling the allocate_tensors() method:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-73-6b4d64de8090> in <module>
1 #interpreter = tflite.Interpreter(model_path='model.tflite')
2 interpreter = tflite.Interpreter(model_path=lite_model_location)
----> 3 interpreter.allocate_tensors()
~/pyenv/srcnn/lib/python3.6/site-packages/tflite_runtime/interpreter.py in allocate_tensors(self)
257 def allocate_tensors(self):
258 self._ensure_safe()
--> 259 return self._interpreter.AllocateTensors()
260
261 def _safe_to_run(self):
RuntimeError: external/org_tensorflow/tensorflow/lite/core/subgraph.cc BytesRequired number of elements overflowed.
Node number 0 (CONV_2D) failed to prepare.
I believe it has to do with the way I've converted my model, but none of the options described for tf.lite.TFLiteConverter are helping.
The tflite model I'm trying to load can be found here; it is a converted version of the saved keras model found here.
The model from the tutorial works without issue. I've noticed differences in the input details between the tflite versions of these models. For example, the tutorial model (working):
{'name': 'input',
'index': 88,
'shape': array([ 1, 224, 224, 3], dtype=int32),
'shape_signature': array([ 1, 224, 224, 3], dtype=int32),
'dtype': <class 'numpy.uint8'>,
'quantization': (0.0078125, 128),
'quantization_parameters': {'scales': array([0.0078125], dtype=float32),
'zero_points': array([128], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}}
The input details for my non-working tflite model are:
{'name': 'input_1',
'index': 0,
'shape': array([1, 1, 1, 3], dtype=int32),
'shape_signature': array([-1, -1, -1, 3], dtype=int32),
'dtype': <class 'numpy.float32'>,
'quantization': (0.0, 0),
'quantization_parameters': {'scales': array([], dtype=float32),
'zero_points': array([], dtype=int32),
'quantized_dimension': 0},
'sparsity_parameters': {}}
Could it be something with the conversion? The model worked fine in development using keras, and should be able to accept inputs of variable x- and y-dimensions (image sizes). I don't think dtypes are the issue here, since uint8 and float32 should both be supported according to the documentation.
Ok, pretty easy fix it turns out. When using a CNN with unknown input dimensions (i.e. -1 in the shape_signature here, caused by setting -1 in the input layer), the unknown dimensions in the input tensor are set to 1. To get the model to allocate properly when using a model like this, you have to do 2 things:
1. Resize the input tensor to the shape of the input data, e.g. interpreter.resize_tensor_input(0, [1, input_shape[0], input_shape[1], 3], strict=True)
2. Set the dtype of the input data to match that of the model's input layer, seen in the 'dtype' entry in the input details.
It seems this is done automatically in regular TensorFlow, but the model must be prepared like this in the Lite version.
Edit: Regarding setting the dtype of the input data, this is done when casting to numpy.array after it is imported from an image format, before calling allocate_tensors(). You can see the difference between the TF implementation (line 332) and the TFLite implementation (line 77).
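For anyone who wants the whole sequence in one place, here is a minimal sketch of how the resize and the dtype cast might fit together. The function name prepare_and_run and its arguments are made up for illustration; it assumes a single-input, single-output model, an image already loaded as a height x width x 3 array, and an interpreter object that behaves like tflite_runtime's Interpreter (with a version recent enough to accept the strict argument).

```python
import numpy as np


def prepare_and_run(interpreter, image):
    """Resize the input tensor to fit `image`, allocate, and run inference.

    Hypothetical helper: `interpreter` is assumed to expose the same methods
    as tflite_runtime.interpreter.Interpreter, and `image` is assumed to be
    an H x W x 3 array-like.
    """
    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]

    # Cast the data to the dtype reported in the input details
    # (float32 for the model in the question) and add a batch dimension.
    batch = np.expand_dims(
        np.asarray(image, dtype=input_details['dtype']), axis=0
    )

    # Replace the placeholder 1s (from the -1 entries in shape_signature)
    # with the real shape before allocating.
    interpreter.resize_tensor_input(
        input_details['index'], list(batch.shape), strict=True
    )
    interpreter.allocate_tensors()

    interpreter.set_tensor(input_details['index'], batch)
    interpreter.invoke()
    return interpreter.get_tensor(output_details['index'])
```

With the setup from the question, this would be called as prepare_and_run(tflite.Interpreter(model_path=lite_model_location), image). Looking up the index from the input details rather than hard-coding 0 is just a precaution; for this model the two are the same.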