There are tutorials online showing how to quantize a TensorFlow
.pb model, see:
https://petewarden.com/2016/05/03/how-to-quantize-neural-networks-with-tensorflow/
What I am wondering is whether there is a way to quantize the graph from Python before saving the .pb
file with tf.train.write_graph().
In other words, is there some function like quantize(graph_def)
that I can run to quantize the graph to 8-bit weights and operations before I save it? That would save me the hassle of doing it from the command line after saving the file, as the tutorial linked above outlines.
You can use the quantize_weights and quantize_nodes rules for the Graph Transform Tool directly from Python. Here's an example: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/python/transform_graph_test.py#L76
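For reference, here is a minimal sketch of what that could look like, assuming a TensorFlow build that ships the tensorflow.tools.graph_transforms module (the 1.x releases did). The helper name quantize_graph_def and the constant QUANTIZE_TRANSFORMS are hypothetical names made up for this illustration; only TransformGraph and the two rule names come from the Graph Transform Tool.

```python
# Rule names from the answer above; both belong to the Graph Transform Tool.
QUANTIZE_TRANSFORMS = ("quantize_weights", "quantize_nodes")


def quantize_graph_def(graph_def, input_names, output_names,
                       transforms=QUANTIZE_TRANSFORMS):
    """Hypothetical helper: return a transformed copy of `graph_def`.

    `input_names` and `output_names` are lists of node names in your graph.
    """
    # Imported lazily so this module still loads where the
    # graph_transforms package is not installed.
    from tensorflow.tools.graph_transforms import TransformGraph

    # TransformGraph(input_graph_def, inputs, outputs, transforms)
    return TransformGraph(graph_def, input_names, output_names,
                          list(transforms))


# Usage sketch (the node names "input" and "output" are placeholders):
#   quantized = quantize_graph_def(sess.graph_def, ["input"], ["output"])
#   tf.train.write_graph(quantized, "/tmp", "quantized.pb", as_text=False)
```

One caveat, as far as I understand the tool: quantize_weights operates on constant ops, so you would probably want to freeze variables into constants first (for example with tf.graph_util.convert_variables_to_constants) before running the transforms.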