I'm following the tutorial to retrain the Inception model for my own problem. I have about 50,000 images in around 100 folders (one per category).
Running this
bazel build tensorflow/examples/image_retraining:retrain
bazel-bin/tensorflow/examples/image_retraining/retrain --image_dir /path/to/root_folder_name
on an Amazon EC2 g2.2xlarge instance, I was hoping the whole process would be quite fast (faster than on my laptop), but creating the bottleneck files takes a long time. After 2 hours only 800 files have been created, so at this rate I will need more than 5 days (!!) just to create the files...
Is it supposed to be faster than this rate (~400 bottleneck files created per hour) because of the GPU?
How can I make the process faster?
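For reference, the estimate above can be reproduced with a few lines of shell arithmetic. This is only a rough sketch: it assumes the creation rate stays constant and that exactly one bottleneck file is written per image, and the variable names are just for illustration.

```shell
# Figures taken from the question; variable names are illustrative.
total_images=50000   # approximate dataset size
files_done=800       # bottleneck files created so far
hours_elapsed=2      # time spent so far

# Integer arithmetic only, so results are rounded down.
rate=$((files_done / hours_elapsed))       # files per hour
hours_needed=$((total_images / rate))      # total hours at that rate
days_needed=$((hours_needed / 24))         # whole days, rounded down

echo "rate: ${rate} files/hour"
echo "estimate: ${hours_needed} hours (about ${days_needed} days)"
```

With the numbers above this gives 400 files/hour and 125 hours, i.e. a little over 5 days.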
Finally found the answer to my question.
Bazel was building the binary without GPU support. To solve this, I modified the files mentioned in these issues:
and ran
TF_UNOFFICIAL_SETTING=1 ./configure
bazel build -c opt --config=cuda tensorflow/examples/image_retraining:retrain --verbose_failures
bazel-bin/tensorflow/examples/image_retraining/retrain --image_dir ~/Images/
In the end, the process was a lot faster (500 images / second), and the training itself also ran on the GPU!
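If you hit the same slowdown, a quick sanity check is to see whether the machine exposes an NVIDIA GPU at all. The snippet below is a minimal sketch, assuming the NVIDIA driver tools (nvidia-smi) are installed; check_gpu is a helper name made up for this example. Note that it only shows that the driver can see a GPU, not that the retrain binary was built with CUDA support.

```shell
# Hypothetical helper: report whether an NVIDIA GPU is visible to the driver.
check_gpu() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # -L lists the GPUs the driver can see, one per line.
        nvidia-smi -L
    else
        echo "nvidia-smi not found: no NVIDIA driver tools on PATH"
    fi
}

check_gpu
```

Running plain nvidia-smi in a second terminal while retrain is working should also list the process and its GPU memory usage if the GPU is actually being used.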