What is the best way to deploy a trained TensorFlow graph into production?

I have been working on machine learning problems lately as part of my internship. So far I have been using TensorFlow with Python because that's what I am most comfortable with. Once a problem is solved using deep learning, I am left with the architecture of the network and the weights. My problem now is: how can I deploy my solution in production? I won't be using TensorFlow Serving, because it is mainly meant for large applications where you set up a remote server and your application sends requests to it. In my case, I just want to develop a machine learning solution and integrate it into existing software that is written in C++ with Visual Studio 2017.

After a lot of research, I have a few solutions in mind:

1) Using the "dnn" module from OpenCV: this module can load graphs and run inference, along with other operations such as extracting a specific layer from the network at run time. It seemed very promising, but I ran into problems with networks that differ even slightly from the one used in the example on the OpenCV GitHub. That example uses "inception5h"; when I tried to load "inception_v3", I got an error about an unknown layer in the network, namely the JPEG_decode layer.

2) Building TensorFlow from source and using it directly from C++. This seemed like the best solution, but I ran into many compilation problems: some parts of my code compile while others do not. I am using Visual Studio 2017 on Windows 10. Although I was able to build TensorFlow from source, I could not compile everything, and the code that failed was not even mine but an example from the TensorFlow website: tensorflow C++ example.

3) Another possibility I am considering is designing the solution in TensorFlow and then using another machine learning framework, such as Caffe2 or CNTK, for deployment into production. I have found some tools for converting graphs from one framework to another here: models converters. This seems reasonable because all I would have to do is find the framework most compatible with Windows and convert the model once I finish designing my solution in TensorFlow and Python. The conversion process seems a little too good to be true, though. Am I wrong?

4) A final possibility I am considering is embedding CPython. I would build the prediction pipeline in Python, wrap it in a few Python functions, then include <Python.h> in my Visual Studio project and call those functions from C++. Here's an example: embedding python in C++. I have never used a solution like this before, and I am not sure what could go wrong.

So basically, what do you think is the best way to deploy a machine learning solution into an existing C++ project in Visual Studio? Am I missing a better solution? Any guidelines or hints are greatly appreciated!

Solution

I ended up using solution 2. After recent TensorFlow updates, it is now easier to build TensorFlow from source on Windows. With this solution, I don't need to worry about model compatibility, since I use TensorFlow from Python for prototyping and from C++ for production.

[EDIT]: As of 2021, I use ONNX Runtime (ORT) to deploy my models in production as part of a C++ application. The documentation for ORT is not great, but the tool itself is very good.