I have a pre-trained PyTorch model that I want to convert to TFLite. The model comes from the SeisBench API. I used the code below for the conversion; it includes some checks to confirm that each format conversion worked.
I followed the flow .pt -> .onnx -> TensorFlow -> TFLite, but the resulting .onnx file (98 kB) is smaller than the final TFLite model (108 kB). I am using the onnx-tensorflow library (https://github.com/onnx/onnx-tensorflow) to convert the .onnx file to TensorFlow.
import torch
import onnx
import seisbench.models as sbm

model = sbm.PhaseNet.from_pretrained("instance")  # load the model from the SeisBench API
#model.load_state_dict(pNET.state_dict())
print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())
# Save model information
print(model.get_model_args())
input_lenght = model.in_samples
input_depth = model.in_channels
# save to .pt
model.eval()  # switch layers such as dropout and batch norm to inference mode
torch.save(model, 'pNET.pt')
# check if the model has been saved correctly
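# weights_only=False is needed on recent PyTorch versions to unpickle a full model object
temp_model = torch.load('pNET.pt', weights_only=False)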
temp_model.eval()
print("Model's state_dict:")
for param_tensor in temp_model.state_dict():
    print(param_tensor, "\t", temp_model.state_dict()[param_tensor].size())
# save to .onnx
# define an input vector (random vector)
sample_input = torch.randn(1, input_depth, input_lenght, requires_grad=True)  # order is batch, channels, length of input
# batch size fixed to 1 for time series data
# export
torch.onnx.export(
    model,                    # PyTorch model
    sample_input,             # input tensor
    'pNET.onnx',              # output file name
    input_names=['input'],    # input tensor name (arbitrary)
    output_names=['output']   # output tensor name (arbitrary)
)
# check if the model has been saved correctly
onnx_model = onnx.load('pNET.onnx')
# Check that the IR is well formed
onnx.checker.check_model(onnx_model)
# Print a human-readable representation of the graph
print(onnx.helper.printable_graph(onnx_model.graph))
# Try to run an inference with the newly saved onnx model
import onnxruntime as ort
import numpy as np
ort_session = ort.InferenceSession('pNET.onnx')
outputs = ort_session.run(
    None,
    {'input': np.random.randn(1, input_depth, input_lenght).astype(np.float32)}  # random input
)
print(outputs)  # check if you get a tensor of the right shape
print(outputs[0].shape)  # run() returns a list with one array per model output
from onnx_tf.backend import prepare
# Converting to TensorFlow model
onnx_model = onnx.load("pNET.onnx") # load onnx model
tf_rep = prepare(onnx_model) # prepare tf representation
tf_rep.export_graph("pNET") # export the model
# Check if the conversion worked
# Run a TF inference
import tensorflow as tf
model = tf.saved_model.load("./pNET")
model.trainable = False
input_tensor = tf.random.uniform([1, input_depth, input_lenght])
out = model(**{'input': input_tensor})
print(out)  # check if you get a tensor of the right shape
print(tf.nest.map_structure(lambda t: t.shape, out))  # works whether out is a tensor or a dict of tensors
# float16 quantization
converter = tf.lite.TFLiteConverter.from_saved_model("./pNET")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_quant_model = converter.convert()
# Save the model
with open('pNETlite16float.tflite', 'wb') as f:
    f.write(tflite_quant_model)  # same size as when I use interpreter instead of converter?
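To compare the artifacts directly, a small helper can report each file's size on disk. This is only a minimal sketch: `report_sizes` is a hypothetical helper name, and the file names are the ones written by the script above.

```python
import os


def report_sizes(paths):
    """Return {path: size in bytes} for every path that exists as a file."""
    sizes = {}
    for path in paths:
        if os.path.isfile(path):  # silently skip files that were not produced
            sizes[path] = os.path.getsize(path)
    return sizes


if __name__ == "__main__":
    # File names taken from the conversion script; adjust if yours differ.
    candidates = ["pNET.pt", "pNET.onnx", "pNETlite16float.tflite"]
    for path, size in report_sizes(candidates).items():
        print(f"{path}: {size / 1000:.1f} kB")
```

The SavedModel export is a directory rather than a single file, so it is left out of this comparison.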
My confusion stems from the fact that I was expecting post-training quantization to reduce model size. Does TFLite add some extra wrappers or methods to a model, increasing the size compared to .onnx?
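One way to sanity-check that expectation is to bound how much of each file can be raw weight data. The sketch below is rough arithmetic under stated assumptions: it treats 1 kB as 1000 bytes, pretends the whole 98 kB .onnx file is float32 weights (an upper bound, since the file also stores the graph), and uses a hypothetical helper `weight_bytes`.

```python
def weight_bytes(num_params, bytes_per_param):
    """Storage needed for num_params values at the given width."""
    return num_params * bytes_per_param


onnx_size = 98_000  # bytes, approximate figure from the question

# Upper bound on the parameter count: assume every byte is a float32 weight.
max_fp32_params = onnx_size // 4

# The same parameters stored as float16 take 2 bytes each.
fp16_weights = weight_bytes(max_fp32_params, 2)

print(max_fp32_params)  # 24500
print(fp16_weights)     # 49000
```

Under these assumptions, float16 weights alone would account for at most about 49 kB, so a 108 kB .tflite file would have to contain either a substantial amount of non-weight data or weights that were not actually stored as float16. This is only a bound, not a diagnosis.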
We now support an official, direct conversion from PyTorch to TFLite. You can give it a try: https://github.com/google-ai-edge/ai-edge-torch