I'm using a laptop with GPU 0: Intel(R) HD Graphics 630 and GPU 1: GTX1050 TI. I just finished setting up YOLO environment in Anaconda Environment using the following tutorial: https://appliedmachinelearning.blog/2018/05/27/running-yolo-v2-for-real-time-object-detection-on-videos-images-via-darkflow/
The problem is: whenever I try to render a video with YOLO in Anaconda environment using GPU
python flow --model cfg/yolo.cfg --load bin/yolov2.weights --demo videofile.mp4 --saveVideo --gpu 0.5
the video does render successfully, BUT, my CPU usage goes up to almost 100% (task manager), while my GPU is not used at all. I tried to specify GPU name by adding --gpuName /gpu:1 to the end, but still, CPU is used. Here are the output lines copied from Anaconda Prompt.
(df) C:\Users\User\Videos\PC-programming\darkflow-master>python flow --model cfg/yolo.cfg --load bin/yolov2.weights --demo videofile.mp4 --saveVideo --gpu 0.5
Parsing ./cfg/yolov2.cfg
Parsing cfg/yolo.cfg
Loading bin/yolov2.weights ...
Successfully identified 203934260 bytes
Finished in 0.022666454315185547s
Model has a coco model name, loading coco labels.
Building net ...
Source | Train? | Layer description | Output size
-------+--------+----------------------------------+---------------
| | input | (?, 608, 608, 3)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 608, 608, 32)
Load | Yep! | maxp 2x2p0_2 | (?, 304, 304, 32)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 304, 304, 64)
Load | Yep! | maxp 2x2p0_2 | (?, 152, 152, 64)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 152, 152, 128)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 152, 152, 64)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 152, 152, 128)
Load | Yep! | maxp 2x2p0_2 | (?, 76, 76, 128)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 76, 76, 256)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 76, 76, 128)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 76, 76, 256)
Load | Yep! | maxp 2x2p0_2 | (?, 38, 38, 256)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 38, 38, 512)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 38, 38, 256)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 38, 38, 512)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 38, 38, 256)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 38, 38, 512)
Load | Yep! | maxp 2x2p0_2 | (?, 19, 19, 512)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 19, 19, 512)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 19, 19, 512)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | concat [16] | (?, 38, 38, 512)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 38, 38, 64)
Load | Yep! | local flatten 2x2 | (?, 19, 19, 256)
Load | Yep! | concat [27, 24] | (?, 19, 19, 1280)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 1x1p0_1 linear | (?, 19, 19, 425)
-------+--------+----------------------------------+---------------
GPU mode with 0.5 usage
2018-10-16 17:21:18.897583: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-10-16 17:21:18.904824: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
Finished in 4.7458178997039795s
Press [ESC] to quit demo
0.719 FPS ......
Then, if I try to render image instead, still, task manager shows that GPU is not used at all.
(df) C:\Users\User\Videos\PC-programming\darkflow-master>python flow --model cfg/yolo.cfg --load bin/yolov2.weights --imgdir sample_img --gpu 0.9
Parsing ./cfg/yolov2.cfg
Parsing cfg/yolo.cfg
Loading bin/yolov2.weights ...
Successfully identified 203934260 bytes
Finished in 0.021943330764770508s
Model has a coco model name, loading coco labels.
Building net ...
Source | Train? | Layer description | Output size
-------+--------+----------------------------------+---------------
| | input | (?, 608, 608, 3)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 608, 608, 32)
Load | Yep! | maxp 2x2p0_2 | (?, 304, 304, 32)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 304, 304, 64)
Load | Yep! | maxp 2x2p0_2 | (?, 152, 152, 64)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 152, 152, 128)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 152, 152, 64)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 152, 152, 128)
Load | Yep! | maxp 2x2p0_2 | (?, 76, 76, 128)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 76, 76, 256)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 76, 76, 128)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 76, 76, 256)
Load | Yep! | maxp 2x2p0_2 | (?, 38, 38, 256)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 38, 38, 512)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 38, 38, 256)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 38, 38, 512)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 38, 38, 256)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 38, 38, 512)
Load | Yep! | maxp 2x2p0_2 | (?, 19, 19, 512)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 19, 19, 512)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 19, 19, 512)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | concat [16] | (?, 38, 38, 512)
Load | Yep! | conv 1x1p0_1 +bnorm leaky | (?, 38, 38, 64)
Load | Yep! | local flatten 2x2 | (?, 19, 19, 256)
Load | Yep! | concat [27, 24] | (?, 19, 19, 1280)
Load | Yep! | conv 3x3p1_1 +bnorm leaky | (?, 19, 19, 1024)
Load | Yep! | conv 1x1p0_1 linear | (?, 19, 19, 425)
-------+--------+----------------------------------+---------------
GPU mode with 0.9 usage
2018-10-16 17:07:30.439641: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-10-16 17:07:30.449381: W C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
Finished in 5.261923789978027s
Forwarding 8 inputs ...
Total time = 10.975605964660645s / 8 inps = 0.7288891406778334 ips
Post processing 8 inputs ...
Total time = 0.48075294494628906s / 8 inps = 16.640563690969756 ips
What's wrong >_< ??? Thanks in advance!
clone https://github.com/thtrieu/darkflow,
download necessary .cfg and weights from https://pjreddie.com/darknet/yolo/,
save them in cfg folder and a new bin folder respectively inside darkflow-master.
conda create -n darkflow-env python=3.6
activate darkflow-env
pip install tensorflow-gpu
(pip, not conda. This step should also install CUDA and cuDNN automatically, no separate download needed.)
conda install cython numpy
conda config --add channels conda-forge
conda install opencv
go to your darkflow-master folder and copy the path
cd to the path (still using the Anaconda Prompt)
python setup.py build_ext --inplace
python flow --model cfg/yolo.cfg --load bin/yolov2.weights --demo videofile.mp4 --saveVideo --gpu 0.7
(the videofile.mp4 is the video to be rendered, I put it inside darkflow-master folder directly)
wait and you'll see output video in the darkflow-master folder (on my laptop with GTX1050 TI graphic card, the rendering speed is around 8.5 FPS).
If you encounter any problem like Microsoft Visual C++ Build Tools is required, then just download from Microsoft website, remember to install the SDK along with the installation too. For more information about this problem you may refer to https://github.com/thtrieu/darkflow/issues/788.