I am using the TensorRT API to optimize a U-Net model built with Keras. The results after optimization are not good enough, so I am thinking of rebuilding the same model directly in TensorFlow, since Keras is a high-level API and maybe its inference is slower. So my question is: would building the same model in TensorFlow improve inference speed compared to the Keras model? And does TensorRT optimize a TensorFlow model better than a Keras model?
I did some research but didn't find anything comparing the inference speed of the same model in TensorFlow versus Keras.
As far as I have tested, there was no significant difference (maybe a tiny overhead for Keras).
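If you want to check this on your own model, a minimal latency benchmark along these lines should settle it. This is just a sketch; the model path and input shape are assumptions you would replace with your own:

```python
import time
import numpy as np
import tensorflow as tf

# Load your trained U-Net (path and input shape below are placeholders)
model = tf.keras.models.load_model("unet.h5")
x = np.random.rand(1, 256, 256, 3).astype(np.float32)

model.predict(x)  # warm-up call so graph construction isn't timed

start = time.perf_counter()
for _ in range(100):
    model.predict(x)
elapsed = time.perf_counter() - start
print("mean latency: %.1f ms" % (elapsed / 100 * 1000))
```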
The better inference time that you expect will not be obtained by switching from Keras to TensorFlow. I have worked with TensorRT, and most of the problems come from the fact that not all layers are supported (for the conversion/optimization).
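One way to see how much of your model actually ended up in TensorRT is to count the `TRTEngineOp` nodes in the converted graph; any other node is an op that fell back to plain TensorFlow. A rough sketch (the SavedModel path is an assumption, and `convert_variables_to_constants_v2` lives in an internal TF module, so the exact import can vary by TF version):

```python
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# Load the TF-TRT converted SavedModel (hypothetical path)
saved = tf.saved_model.load("unet_trt_savedmodel")
func = saved.signatures["serving_default"]

# Freeze the graph so we can walk its nodes and tally op types
frozen = convert_variables_to_constants_v2(func)
op_counts = {}
for node in frozen.graph.as_graph_def().node:
    op_counts[node.op] = op_counts.get(node.op, 0) + 1

print("TRTEngineOp segments:", op_counts.get("TRTEngineOp", 0))
print("ops left in TensorFlow:",
      {op: n for op, n in op_counts.items() if op != "TRTEngineOp"})
```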
Ensure that the entire pipeline, Keras Model -- TensorFlow Model -- Layer Optimization -- TensorRT, is done with the same version of TensorFlow. I would also recommend training the model via tensorflow.keras instead of standalone keras.
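Concretely, that means importing everything from the TensorFlow namespace and exporting a SavedModel, which is what TF-TRT consumes. A minimal sketch (the layers are a stand-in, not your actual U-Net):

```python
import tensorflow as tf
from tensorflow import keras  # tf.keras, not the standalone `keras` package

# Toy stand-in for the real U-Net architecture
model = keras.Sequential([
    keras.layers.Conv2D(16, 3, activation="relu",
                        input_shape=(256, 256, 3)),
    keras.layers.Conv2D(1, 1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Saving to a directory writes the SavedModel format in TF 2.x
model.save("unet_savedmodel")
```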
Also, make sure that you convert with the right floating-point precision (FP32/FP16/INT8). The biggest gain in inference speed comes from converting from the standard FP32 to INT8. In my experience, the conversion from FP32 to FP16 does not speed things up significantly.
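With TF-TRT the precision is set in the conversion parameters. A sketch at FP16, assuming the SavedModel paths from above; note that INT8 additionally needs a calibration dataset passed via `converter.convert(calibration_input_fn=...)`:

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Request FP16 engines; swap in TrtPrecisionMode.INT8 once you have
# a calibration_input_fn that yields representative input batches.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="unet_savedmodel",  # path from the example above
    conversion_params=params)
converter.convert()
converter.save("unet_trt_savedmodel")
```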
Semantic segmentation is one of the most computationally expensive vision tasks, so don't expect a very fast inference model when deployed on a Jetson TX2, for example (even with TensorRT).