First, my operating environment:
Windows 10 x64
CUDA 9.1 and cuDNN 7
GTX 1080 Ti x2
i7-6850K
I wrote a program with the TensorFlow C++ API that reads a pb file and then feeds it an image for prediction. My goal is to use all GPUs from TensorFlow, either from a single thread or with one thread per GPU.
I first trained the model with TensorFlow Slim in Python on Windows, then converted the saved model files to a frozen graph with freeze_graph.py.
However, I found that only one GPU is used when calling session->Run(), no matter whether I create multiple threads or a single one. I tried to call multiple GPUs in the following ways:
tensorflow::graph::SetDefaultDevice("0", &graphdef);
or
GraphDef graphdef; // graph definition for the current model
Status status_load = ReadBinaryProto(Env::Default(), model_path, &graphdef); // read graph from the pb file
if (!status_load.ok()) {
    std::cout << " ERROR: Loading model failed...\n"
              << model_path << std::endl;
    std::cout << status_load.ToString() << "\n";
    system("pause");
    return;
}
tensorflow::SessionOptions options;
tensorflow::ConfigProto& config = options.config;
config.set_log_device_placement(true);
config.mutable_gpu_options()->set_allow_growth(true);
//config.mutable_gpu_options()->set_allocator_type(std::string("BFC"));
//config.mutable_gpu_options()->set_visible_device_list(""); // no error, but still only one GPU is used
//config.mutable_gpu_options()->set_visible_device_list("0"); // error!
config.mutable_gpu_options()->set_visible_device_list("0,1"); // error!
config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(1);
Session* session;
Status status = NewSession(options, &session);
Status status_create = session->Create(graphdef);
Both of the above methods failed, and the error message is the same:
2018-08-08 09:25:55.953495: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-08-08 09:25:56.541237: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\gpu\gpu_device.cc:1404] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:06:00.0
totalMemory: 11.00GiB freeMemory: 9.02GiB
2018-08-08 09:25:56.708385: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\gpu\gpu_device.cc:1404] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:0b:00.0
totalMemory: 11.00GiB freeMemory: 9.02GiB
2018-08-08 09:25:56.731390: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\gpu\gpu_device.cc:1483] Adding visible gpu devices: 0, 1
2018-08-08 09:26:04.117910: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\gpu\gpu_device.cc:964] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-08 09:26:04.131670: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\gpu\gpu_device.cc:970] 0 1
2018-08-08 09:26:04.142367: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\gpu\gpu_device.cc:983] 0: N N
2018-08-08 09:26:04.152745: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\gpu\gpu_device.cc:983] 1: N N
2018-08-08 09:26:04.173833: E D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\gpu\gpu_process_state.cc:105] Invalid allocator type: 0,1
2018-08-08 09:26:04.189278: E D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\direct_session.cc:158] Internal: Failed to get memory allocator for TF GPU 0 with 11811160064 bytes of memory.
ERROR: Creating Session failed...
Internal: Failed to create session.
Press any key to continue......
Following the hint in the error, I switched to "/gpu:0" and "/device:GPU:0" as the GPU id, but parsing those strings fails as well, as follows:
2018-08-08 09:31:07.052736: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-08-08 09:31:07.643228: E D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\direct_session.cc:158] Invalid argument: Could not parse entry in 'visible_device_list': '/device:GPU:0'. visible_device_list = /device:GPU:0
ERROR: Creating Session failed...
Internal: Failed to create session.
or
2018-08-08 09:32:28.753232: I D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-08-08 09:32:29.082282: E D:\MyProject\tensorflow-1.10.0-rc1\tensorflow\core\common_runtime\direct_session.cc:158] Invalid argument: Could not parse entry in 'visible_device_list': '/gpu:0'. visible_device_list = /gpu:0
ERROR: Creating Session failed...
Internal: Failed to create session.
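From the errors, my current understanding (which may be wrong) is that visible_device_list and device placement use two different formats; a sketch of the distinction, assuming the stock ConfigProto API:

```cpp
// visible_device_list takes bare CUDA ordinals such as "0" or "0,1",
// NOT placement strings like "/gpu:0" or "/device:GPU:0".
tensorflow::SessionOptions options;
options.config.mutable_gpu_options()->set_visible_device_list("0,1");

// Full placement strings are used elsewhere, e.g. to pin a GraphDef
// to one device before creating the session:
tensorflow::graph::SetDefaultDevice("/device:GPU:1", &graphdef);
```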
Then I found the same error in the tensorflow GitHub issues and tried the suggestions there.
Following the plans in #5379:
1. Modify {tf_root}\tensorflow\tf_version_script.lds and add "protobuf;".
Failure!
2. Link the corresponding libs:
tf_core_gpu_kernels.lib
training_ops_gen_cc.lib
transform_graph.lib
tf_protos_cc.lib
user_ops_gen_cc.lib
Failure!
But if I use either of the following:
config.mutable_gpu_options()->set_visible_device_list("")
or
tensorflow::graph::SetDefaultDevice("", &graphdef)
then the session is created and runs, but still only one GPU is used!
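One direction I have not fully tested yet: leave both GPUs visible to the process, and instead pin each copy of the graph to a device before Create(). A sketch, assuming the frozen graph has no hard-coded device assignments:

```cpp
// Load the frozen graph into two copies and pin each copy to one GPU;
// then create one session per pinned graph (each session sees both
// devices, so there is no visible_device_list conflict) and drive each
// session from its own thread.
tensorflow::GraphDef graph_gpu0 = graphdef;  // graphdef loaded as above
tensorflow::GraphDef graph_gpu1 = graphdef;
tensorflow::graph::SetDefaultDevice("/device:GPU:0", &graph_gpu0);
tensorflow::graph::SetDefaultDevice("/device:GPU:1", &graph_gpu1);

tensorflow::SessionOptions options;
options.config.set_allow_soft_placement(true);  // fall back if an op has no GPU kernel
```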
I found the same error in issue #18861, but there was no C++ solution there, so I suspected my tensorflow build. I recompiled 1.9.0 and the latest 1.10.0-rc1, but got the same error.
Could someone help me solve this problem? └(^o^)┘
I would really, really appreciate it!
Thank you for replying!
I may have found a solution, but in today's tests it has not yet met my requirements:
tensorflow::SessionOptions options;
tensorflow::ConfigProto& config = options.config;
auto* device_count = config.mutable_device_count();
/*device_count->insert({ "CPU", 1 });*/
//device_count->insert({ "GPU", 1 }); // 1 means one GPU in total, not "/gpu:0"
device_count->insert({ "GPU", 2 }); // 2 means two GPUs, i.e. "/gpu:0" and "/gpu:1"
Session* session;
Status status = NewSession(options, &session); // create a new session
std::vector<DeviceAttributes> response;
session->ListDevices(&response);
// print the device list
for (size_t temIndex = 0; temIndex < response.size(); ++temIndex) {
    std::cout << "ListDevices(): " << temIndex << " " << response[temIndex].name() << std::endl;
}
Using this method behaves the same as:
options.config.mutable_gpu_options()->set_visible_device_list("");
It is still impossible to specify which GPU is used, and all the computation is still placed on one GPU, so I think my method still has a problem.
But I feel like I am getting closer to a solution....
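For completeness, the per-thread Run loop I plan to test next, assuming I can obtain one session per GPU (the names session_gpu0/session_gpu1 and the node names "input:0"/"output:0" are hypothetical placeholders, not from my actual graph):

```cpp
// Hypothetical: two sessions, each created from a graph pinned to one GPU,
// driven concurrently from two threads.
std::thread t0([&] {
    std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = {{"input:0", image0}};
    std::vector<tensorflow::Tensor> outputs;
    TF_CHECK_OK(session_gpu0->Run(inputs, {"output:0"}, {}, &outputs));
});
std::thread t1([&] {
    std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = {{"input:0", image1}};
    std::vector<tensorflow::Tensor> outputs;
    TF_CHECK_OK(session_gpu1->Run(inputs, {"output:0"}, {}, &outputs));
});
t0.join();
t1.join();
```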