python tensorflow protocol-buffers tensorrt

Extremely long time (over 10 minutes) to load TensorRT-optimized TensorFlow graphs from .pb file


I'm seeing extremely long load times for TensorFlow graphs optimized with TensorRT. Non-optimized graphs load quickly, but loading an optimized one takes over 10 minutes with the very same code:

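import tensorflow as tf  # TensorFlow 1.x API

trt_graph_def = tf.GraphDef()
with tf.gfile.GFile(pb_path, 'rb') as pf:  # pb_path: path to the .pb file
    trt_graph_def.ParseFromString(pf.read())  # this is the call that hangs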

I'm on an NVIDIA Drive PX 2 device (if that matters), with TensorFlow 1.12.0 built from source, CUDA 9.2 and TensorRT 4.1.1. Because it gets stuck in ParseFromString(), I suspect protobuf, so here is its configuration:

$ dpkg -l | grep protobuf
ii libmirprotobuf3:arm64 0.26.3+16.04.20170605-0ubuntu1.1 arm64 Display server for Ubuntu - RPC definitions
ii libprotobuf-dev:arm64 2.6.1-1.3 arm64 protocol buffers C++ library (development files)
ii libprotobuf-lite9v5:arm64 2.6.1-1.3 arm64 protocol buffers C++ library (lite version)
ii libprotobuf9v5:arm64 2.6.1-1.3 arm64 protocol buffers C++ library
ii protobuf-compiler 2.6.1-1.3 arm64 compiler for protocol buffer definition files

$ pip3 freeze | grep protobuf
protobuf==3.6.1
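
To confirm that the time goes into protobuf parsing rather than into reading the file from disk, the two steps can be timed separately. This is only a minimal sketch: `timed` is a hypothetical helper name, and the commented usage assumes the TensorFlow 1.x API and the `pb_path` variable from the loading snippet above.

```python
import time


def timed(label, fn, *args, **kwargs):
    """Call fn(*args, **kwargs), print the elapsed time, return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print("{}: {:.3f} s".format(label, elapsed))
    return result, elapsed


# Hypothetical usage with the loading code above (requires TensorFlow 1.x):
#   import tensorflow as tf
#   with tf.gfile.GFile(pb_path, 'rb') as pf:
#       data, read_s = timed("read file", pf.read)
#   graph_def = tf.GraphDef()
#   _, parse_s = timed("ParseFromString", graph_def.ParseFromString, data)
```

If nearly all of the time is reported for `ParseFromString`, the protobuf backend is the likely bottleneck rather than disk I/O.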

And here's the way I convert non-optimized models to TRT ones:

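import tensorflow as tf
import tensorflow.contrib.tensorrt as trt  # TF-TRT integration in TensorFlow 1.x

def get_frozen_graph(graph_file):
  """Read Frozen Graph file from disk."""
  # tf.gfile.GFile replaces the deprecated tf.gfile.FastGFile
  with tf.gfile.GFile(graph_file, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
  return graph_def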

print("Load frozen graph from disk")

frozen_graph = get_frozen_graph(DATA_DIR + MODEL + '.pb')

print("Optimize the model with TensorRT")

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 26,
    precision_mode='FP16',
    minimum_segment_size=2
)

print("Write optimized model to the file")
with open(DATA_DIR + MODEL + '_fp16_trt.pb', 'wb') as f:
    f.write(trt_graph.SerializeToString())

Tested on ssd_mobilenet_v1_coco, ssd_mobilenet_v2_coco and ssd_inception_v2_coco from the model zoo; all behave the same way: the downloaded .pb file loads in seconds, while the TRT-optimized one takes well over 10 minutes. What's wrong? Has anyone seen the same behavior, and are there any hints on how to fix it?


Solution

  • OK, I think I got it sorted out. I left protobuf 2.6.1 almost untouched, installed 3.6.1 from source with the cpp implementation next to it, and set the symlinks so that 3.6.1 is the default. Now, after:

    export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
    

    all models load in a fraction of a second.
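
Whether the C++ backend is actually in use can be checked from Python. The sketch below relies on `google.protobuf.internal.api_implementation`, which is an internal protobuf module, so treat it as a diagnostic aid rather than a stable API. `protobuf_backend` is a hypothetical helper name.

```python
def protobuf_backend():
    """Return the active protobuf Python backend name, or None if protobuf is missing."""
    try:
        # Internal module: its location may change between protobuf releases.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


print("protobuf backend:", protobuf_backend())
```

With the pure-Python backend this reports `python`. Once the C++ implementation is installed and selected, it should report `cpp`.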

    Here are the exact steps I took, for reference:

    # Check current version
    $ protoc --version
    libprotoc 2.6.1
    
    # Create a backup of the current config, just in case
    mkdir protobuf
    cd protobuf/
    mkdir backup_originals
    mkdir backup_originals/protoc
    cp /usr/bin/protoc backup_originals/protoc/
    tar cvzf backup_originals/libprotobuf.tgz /usr/lib/aarch64-linux-gnu/libprotobuf*
    # Original include files located at: /usr/include/google/protobuf/
    # I did not back them up
    
    # Original configuration of the libraries
    $ ls -l /usr/lib/aarch64-linux-gnu/libprotobuf*
    -rw-r--r-- 1 root root 2464506 Oct 24  2015 /usr/lib/aarch64-linux-gnu/libprotobuf.a
    -rw-r--r-- 1 root root  430372 Oct 24  2015 /usr/lib/aarch64-linux-gnu/libprotobuf-lite.a
    lrwxrwxrwx 1 root root      25 Oct 24  2015 /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so -> libprotobuf-lite.so.9.0.1
    lrwxrwxrwx 1 root root      25 Oct 24  2015 /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so.9 -> libprotobuf-lite.so.9.0.1
    -rw-r--r-- 1 root root  199096 Oct 24  2015 /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so.9.0.1
    lrwxrwxrwx 1 root root      20 Oct 24  2015 /usr/lib/aarch64-linux-gnu/libprotobuf.so -> libprotobuf.so.9.0.1
    lrwxrwxrwx 1 root root      20 Oct 24  2015 /usr/lib/aarch64-linux-gnu/libprotobuf.so.9 -> libprotobuf.so.9.0.1
    -rw-r--r-- 1 root root 1153872 Oct 24  2015 /usr/lib/aarch64-linux-gnu/libprotobuf.so.9.0.1
    
    # Fetch and unpack the sources of version 3.6.1
    wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protobuf-python-3.6.1.zip
    wget https://github.com/protocolbuffers/protobuf/releases/download/v3.6.1/protoc-3.6.1-linux-aarch_64.zip
    unzip protoc-3.6.1-linux-aarch_64.zip -d protoc-3.6.1
    unzip protobuf-python-3.6.1.zip
    
    # Update the protoc
    sudo cp protoc-3.6.1/bin/protoc /usr/bin/protoc
    
    $ protoc --version
    libprotoc 3.6.1
    
    # BUILD AND INSTALL THE LIBRARIES
    export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
    cd protobuf-3.6.1/
    ./autogen.sh
    ./configure
    make
    make check
    sudo make install
    
    # Remove unnecessary links to the old version
    sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf.a
    sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf-lite.a
    sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf-lite.so
    sudo rm /usr/lib/aarch64-linux-gnu/libprotobuf.so
    
    # Move old version of the libraries to the same folder where the new ones have been installed, for clarity
    sudo cp -d /usr/lib/aarch64-linux-gnu/libproto* /usr/local/lib/
    sudo rm /usr/lib/aarch64-linux-gnu/libproto*
    
    
    sudo ldconfig # Refresh shared library cache   
    
    
    # Check the updated version
    $ protoc --version
    libprotoc 3.6.1
    
    
    # Final configuration of the libraries after the update
    $ ls -l /usr/local/lib/libproto*
    -rw-r--r-- 1 root root 77064022 Feb  9 11:07 /usr/local/lib/libprotobuf.a
    -rwxr-xr-x 1 root root      978 Feb  9 11:07 /usr/local/lib/libprotobuf.la
    -rw-r--r-- 1 root root  9396522 Feb  9 11:07 /usr/local/lib/libprotobuf-lite.a
    -rwxr-xr-x 1 root root     1013 Feb  9 11:07 /usr/local/lib/libprotobuf-lite.la
    lrwxrwxrwx 1 root root       26 Feb  9 11:07 /usr/local/lib/libprotobuf-lite.so -> libprotobuf-lite.so.17.0.0
    lrwxrwxrwx 1 root root       26 Feb  9 11:07 /usr/local/lib/libprotobuf-lite.so.17 -> libprotobuf-lite.so.17.0.0
    -rwxr-xr-x 1 root root  3722376 Feb  9 11:07 /usr/local/lib/libprotobuf-lite.so.17.0.0
    lrwxrwxrwx 1 root root       25 Feb  9 11:19 /usr/local/lib/libprotobuf-lite.so.9 -> libprotobuf-lite.so.9.0.1
    -rw-r--r-- 1 root root   199096 Feb  9 11:19 /usr/local/lib/libprotobuf-lite.so.9.0.1
    lrwxrwxrwx 1 root root       21 Feb  9 11:07 /usr/local/lib/libprotobuf.so -> libprotobuf.so.17.0.0
    lrwxrwxrwx 1 root root       21 Feb  9 11:07 /usr/local/lib/libprotobuf.so.17 -> libprotobuf.so.17.0.0
    -rwxr-xr-x 1 root root 30029352 Feb  9 11:07 /usr/local/lib/libprotobuf.so.17.0.0
    lrwxrwxrwx 1 root root       20 Feb  9 11:19 /usr/local/lib/libprotobuf.so.9 -> libprotobuf.so.9.0.1
    -rw-r--r-- 1 root root  1153872 Feb  9 11:19 /usr/local/lib/libprotobuf.so.9.0.1
    -rw-r--r-- 1 root root 99883696 Feb  9 11:07 /usr/local/lib/libprotoc.a
    -rwxr-xr-x 1 root root      994 Feb  9 11:07 /usr/local/lib/libprotoc.la
    lrwxrwxrwx 1 root root       19 Feb  9 11:07 /usr/local/lib/libprotoc.so -> libprotoc.so.17.0.0
    lrwxrwxrwx 1 root root       19 Feb  9 11:07 /usr/local/lib/libprotoc.so.17 -> libprotoc.so.17.0.0
    -rwxr-xr-x 1 root root 32645760 Feb  9 11:07 /usr/local/lib/libprotoc.so.17.0.0
    lrwxrwxrwx 1 root root       18 Feb  9 11:19 /usr/local/lib/libprotoc.so.9 -> libprotoc.so.9.0.1
    -rw-r--r-- 1 root root   991440 Feb  9 11:19 /usr/local/lib/libprotoc.so.9.0.1
    
    # Reboot, just in case :)
    sudo reboot
    
    # BUILD AND INSTALL THE PYTHON-PROTOBUF MODULE
    cd protobuf-3.6.1/python/
    export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
    
    
    # Fix setup.py to force compilation with c++11 standard
    vim setup.py
    
    $ diff setup.py setup.py~
    205,208c205,208
    <     #if v:
    <     #  extra_compile_args.append('-std=c++11')
    <     #elif os.getenv('KOKORO_BUILD_NUMBER') or os.getenv('KOKORO_BUILD_ID'):
    <     extra_compile_args.append('-std=c++11')
    ---
    >     if v:
    >       extra_compile_args.append('-std=c++11')
    >     elif os.getenv('KOKORO_BUILD_NUMBER') or os.getenv('KOKORO_BUILD_ID'):
    >       extra_compile_args.append('-std=c++11')
    
    # Build, test and install
    python3 setup.py build --cpp_implementation
    python3 setup.py test --cpp_implementation
    sudo python3 setup.py install --cpp_implementation
    
    # Make the cpp backend the default when a user logs in
    sudo sh -c "echo 'export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp' >> /etc/profile.d/protobuf.sh"
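
The profile script is read by login shells. For a process started some other way, the same variable can be set from Python, provided this happens before protobuf is first imported, because the backend is chosen at import time. This is a minimal sketch, and `select_protobuf_backend` is a hypothetical helper name:

```python
import os
import sys

ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def select_protobuf_backend(backend="cpp"):
    """Request a protobuf backend for this process.

    Returns True if protobuf's backend selection had not run yet (the setting
    can still take effect), False if it had already been imported.
    """
    already_chosen = "google.protobuf.internal.api_implementation" in sys.modules
    os.environ[ENV_VAR] = backend
    return not already_chosen


# Call this before importing tensorflow or google.protobuf.
in_time = select_protobuf_backend("cpp")
print(ENV_VAR, "=", os.environ[ENV_VAR], "| set before backend selection:", in_time)
```

Requesting `cpp` only helps if the C++ extension was actually built and installed, as in the steps above.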
    

    I found that this update tends to break pip, so I simply updated pip with:

    wget http://se.archive.ubuntu.com/ubuntu/pool/universe/p/python-pip/python3-pip_9.0.1-2_all.deb
    wget http://se.archive.ubuntu.com/ubuntu/pool/universe/p/python-pip/python-pip-whl_9.0.1-2_all.deb
    sudo dpkg -i *.deb