I'm seeing extremely long load times for TensorFlow graphs optimized with TensorRT. Non-optimized graphs load quickly, but loading an optimized one takes over 10 minutes with exactly the same code:
trt_graph_def = tf.GraphDef()
with tf.gfile.GFile(pb_path, 'rb') as pf:
    trt_graph_def.ParseFromString(pf.read())
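To tell whether the time goes into reading the file or into parsing it, the two steps can be timed separately. This is only a minimal sketch: `timed` and `read_bytes` are hypothetical helpers introduced here for illustration, and the commented usage assumes `pb_path` and `trt_graph_def` from the snippet above.

```python
import time


def timed(label, fn, *args):
    """Call fn(*args), print the elapsed wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    print("%s: %.3f s" % (label, elapsed))
    return result, elapsed


def read_bytes(path):
    """Read a whole file into memory as bytes."""
    with open(path, "rb") as f:
        return f.read()


# Intended use with the loading snippet (names taken from it):
#   data, _ = timed("read", read_bytes, pb_path)
#   _, parse_seconds = timed("parse", trt_graph_def.ParseFromString, data)
```

If the "parse" line accounts for nearly all of the time, the bottleneck is in protobuf deserialization rather than in disk I/O.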
I'm on an NVIDIA Drive PX 2 (if that matters), with TensorFlow 1.12.0 built from source, CUDA 9.2 and TensorRT 4.1.1. Since it gets stuck in ParseFromString(), I suspect protobuf, so here is its configuration:
$ dpkg -l | grep protobuf
ii libmirprotobuf3:arm64 0.26.3+16.04.20170605-0ubuntu1.1 arm64 Display server for Ubuntu - RPC definitions
ii libprotobuf-dev:arm64 2.6.1-1.3 arm64 protocol buffers C++ library (development files)
ii libprotobuf-lite9v5:arm64 2.6.1-1.3 arm64 protocol buffers C++ library (lite version)
ii libprotobuf9v5:arm64 2.6.1-1.3 arm64 protocol buffers C++ library
ii protobuf-compiler 2.6.1-1.3 arm64 compiler for protocol buffer definition files
$ pip3 freeze | grep protobuf
protobuf==3.6.1
And here's the way I convert non-optimized models to TRT ones:
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt

def get_frozen_graph(graph_file):
    """Read Frozen Graph file from disk."""
    with tf.gfile.FastGFile(graph_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    return graph_def
print("Load frozen graph from disk")
frozen_graph = get_frozen_graph(DATA_DIR + MODEL + '.pb')
print("Optimize the model with TensorRT")
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 26,
    precision_mode='FP16',
    minimum_segment_size=2
)
print("Write optimized model to the file")
with open(DATA_DIR + MODEL + '_fp16_trt.pb', 'wb') as f:
    f.write(trt_graph.SerializeToString())
Tested on ssd_mobilenet_v1_coco, ssd_mobilenet_v2_coco and ssd_inception_v2_coco from the model zoo; all behave the same way: the downloaded pb file loads in seconds, while the TRT-optimized one takes well over 10 minutes. What's wrong? Has anyone experienced the same thing, and are there any hints on how to fix it?
OK, I think I got it sorted out. I left protobuf 2.6.1 almost untouched, installed 3.6.1 from source with the cpp implementation next to it, and set the symlinks so that 3.6.1 is the default. Now, after:
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
all models load in a fraction of a second.
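To confirm which protobuf backend Python actually picked up, something like the following can be run in the same environment. It is a sketch, not an official diagnostic: `protobuf_backend` is a hypothetical helper name, and `google.protobuf.internal.api_implementation` is an internal module whose location could differ in other protobuf releases.

```python
import os


def protobuf_backend():
    """Return (version, backend) for the installed protobuf, or (None, None)."""
    try:
        import google.protobuf
        # Internal module; assumed to be present in the installed release.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None, None
    return google.protobuf.__version__, api_implementation.Type()


version, backend = protobuf_backend()
print("requested backend:", os.environ.get("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"))
print("protobuf version:", version)
print("active backend:", backend)
```

If the active backend is reported as `python` rather than `cpp`, the pure-Python parser is in use, which would be consistent with the slow ParseFromString() calls described in the question.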
Here are the exact steps I took, for reference:
# Check current version
$ protoc --version
libprotoc 2.6.1
# Create a backup of the current config, just in case
mkdir protobuf
cd protobuf/
mkdir backup_originals
mkdir backup_originals/protoc
cp /usr/bin/protoc backup_originals/protoc/
tar cvzf backup_originals/libprotobuf.tgz /usr/lib/aarch64-linux-gnu/libprotobuf*
# Original include files located at: /usr/include/google/protobuf/
# I did not back them up
# Original configuration of the libraries
$ ls -l /usr/lib/aarch64-linux-gnu/libprotobuf*
-rw-r--r-- 1 root root 2464506 Oct 24 2015 /usr/lib/aarch64-linux-gnu/libprotobuf.a
-rw-r--r-- 1 root root 430372 Oct 24 2015 /usr/lib/aarch64-linux-gnu/libprotobuf-lite.a
lrwxrwxrwx 1 root root 25 Oct 24 2015 /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so -> libprotobuf-lite.so.9.0.1
lrwxrwxrwx 1 root root 25 Oct 24 2015 /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so.9 -> libprotobuf-lite.so.9.0.1
-rw-r--r-- 1 root root 199096 Oct 24 2015 /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so.9.0.1
lrwxrwxrwx 1 root root 20 Oct 24 2015 /usr/lib/aarch64-linux-gnu/libprotobuf.so -> libprotobuf.so.9.0.1
lrwxrwxrwx 1 root root 20 Oct 24 2015 /usr/lib/aarch64-linux-gnu/libprotobuf.so.9 -> libprotobuf.so.9.0.1
-rw-r--r-- 1 root root 1153872 Oct 24 2015 /usr/lib/aarch64-linux-gnu/libprotobuf.so.9.0.1
# Fetch and unpack the sources of version 3.6.1
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protobuf-python-3.6.1.zip
wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protoc-3.6.1-linux-aarch_64.zip
unzip protoc-3.6.1-linux-aarch_64.zip -d protoc-3.6.1
unzip protobuf-python-3.6.1.zip
# Update the protoc
sudo cp protoc-3.6.1/bin/protoc /usr/bin/protoc
$ protoc --version
libprotoc 3.6.1
# BUILD AND INSTALL THE LIBRARIES
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
cd protobuf-3.6.1/
./autogen.sh
./configure
make
make check
sudo make install
# Remove unnecessary links to the old version
sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf.a
sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf-lite.a
sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so
sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf.so
# Move the old versions of the libraries to the folder where the new ones were installed, for clarity
sudo cp -d /usr/lib/aarch64-linux-gnu/libproto* /usr/local/lib/
sudo rm /usr/lib/aarch64-linux-gnu/libproto*
sudo ldconfig # Refresh shared library cache
# Check the updated version
$ protoc --version
libprotoc 3.6.1
# Final configuration of the libraries after the update
$ ls -l /usr/local/lib/libproto*
-rw-r--r-- 1 root root 77064022 Feb 9 11:07 /usr/local/lib/libprotobuf.a
-rwxr-xr-x 1 root root 978 Feb 9 11:07 /usr/local/lib/libprotobuf.la
-rw-r--r-- 1 root root 9396522 Feb 9 11:07 /usr/local/lib/libprotobuf-lite.a
-rwxr-xr-x 1 root root 1013 Feb 9 11:07 /usr/local/lib/libprotobuf-lite.la
lrwxrwxrwx 1 root root 26 Feb 9 11:07 /usr/local/lib/libprotobuf-lite.so -> libprotobuf-lite.so.17.0.0
lrwxrwxrwx 1 root root 26 Feb 9 11:07 /usr/local/lib/libprotobuf-lite.so.17 -> libprotobuf-lite.so.17.0.0
-rwxr-xr-x 1 root root 3722376 Feb 9 11:07 /usr/local/lib/libprotobuf-lite.so.17.0.0
lrwxrwxrwx 1 root root 25 Feb 9 11:19 /usr/local/lib/libprotobuf-lite.so.9 -> libprotobuf-lite.so.9.0.1
-rw-r--r-- 1 root root 199096 Feb 9 11:19 /usr/local/lib/libprotobuf-lite.so.9.0.1
lrwxrwxrwx 1 root root 21 Feb 9 11:07 /usr/local/lib/libprotobuf.so -> libprotobuf.so.17.0.0
lrwxrwxrwx 1 root root 21 Feb 9 11:07 /usr/local/lib/libprotobuf.so.17 -> libprotobuf.so.17.0.0
-rwxr-xr-x 1 root root 30029352 Feb 9 11:07 /usr/local/lib/libprotobuf.so.17.0.0
lrwxrwxrwx 1 root root 20 Feb 9 11:19 /usr/local/lib/libprotobuf.so.9 -> libprotobuf.so.9.0.1
-rw-r--r-- 1 root root 1153872 Feb 9 11:19 /usr/local/lib/libprotobuf.so.9.0.1
-rw-r--r-- 1 root root 99883696 Feb 9 11:07 /usr/local/lib/libprotoc.a
-rwxr-xr-x 1 root root 994 Feb 9 11:07 /usr/local/lib/libprotoc.la
lrwxrwxrwx 1 root root 19 Feb 9 11:07 /usr/local/lib/libprotoc.so -> libprotoc.so.17.0.0
lrwxrwxrwx 1 root root 19 Feb 9 11:07 /usr/local/lib/libprotoc.so.17 -> libprotoc.so.17.0.0
-rwxr-xr-x 1 root root 32645760 Feb 9 11:07 /usr/local/lib/libprotoc.so.17.0.0
lrwxrwxrwx 1 root root 18 Feb 9 11:19 /usr/local/lib/libprotoc.so.9 -> libprotoc.so.9.0.1
-rw-r--r-- 1 root root 991440 Feb 9 11:19 /usr/local/lib/libprotoc.so.9.0.1
# Reboot, just in case :)
sudo reboot
# BUILD AND INSTALL THE PYTHON-PROTOBUF MODULE
cd protobuf-3.6.1/python/
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
# Fix setup.py to force compilation with c++11 standard
vim setup.py
$ diff setup.py setup.py~
205,208c205,208
< #if v:
< # extra_compile_args.append('-std=c++11')
< #elif os.getenv('KOKORO_BUILD_NUMBER') or os.getenv('KOKORO_BUILD_ID'):
< extra_compile_args.append('-std=c++11')
---
> if v:
> extra_compile_args.append('-std=c++11')
> elif os.getenv('KOKORO_BUILD_NUMBER') or os.getenv('KOKORO_BUILD_ID'):
> extra_compile_args.append('-std=c++11')
# Build, test and install
python3 setup.py build --cpp_implementation
python3 setup.py test --cpp_implementation
sudo python3 setup.py install --cpp_implementation
# Make the cpp backend the default when a user logs in
sudo sh -c "echo 'export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp' >> /etc/profile.d/protobuf.sh"
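Note that a script in /etc/profile.d is only read by login shells, so processes started in other ways (a service or a cron job, for example) may not see the variable. One alternative is to set it at the top of the Python script itself. This is a sketch under that assumption; `prefer_cpp_protobuf` is a hypothetical helper name.

```python
import os


def prefer_cpp_protobuf():
    """Request the C++ protobuf backend unless the variable is already set.

    This must run before the first import of tensorflow or google.protobuf,
    because the backend is selected when protobuf is first imported.
    """
    return os.environ.setdefault("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION", "cpp")


chosen = prefer_cpp_protobuf()
print("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION =", chosen)

# import tensorflow as tf  # import only after the variable has been set
```

This only helps when the C++ extension has actually been built and installed as in the steps above; without it, importing protobuf with the variable set to `cpp` may fail.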
I found that this update tends to break pip, so I simply updated it with:
wget http://se.archive.ubuntu.com/ubuntu/pool/universe/p/python-pip/python3-pip_9.0.1-2_all.deb
wget http://se.archive.ubuntu.com/ubuntu/pool/universe/p/python-pip/python-pip-whl_9.0.1-2_all.deb
sudo dpkg -i *.deb