I have installed the latest versions of both TensorFlow and Bazel.
To train a model from scratch, I have to run the following command, taken from https://github.com/tensorflow/models:
bazel-bin/inception/imagenet_train --num_gpus=1 --batch_size=32 --train_dir=/tmp/imagenet_train --data_dir=/tmp/imagenet_data
It gives an error:
bazel-bin/inception/image_train: No such file or directory
bazel-bin seems to be a file, not a directory.
Furthermore, if I go to the /models/inception/inception path and try to run the imagenet_train.py file, it throws an error:
command not found
I have no idea why it isn't working. I have followed every step, and this has been bugging me for a long time now.
Original answer:
You have to build imagenet_train first. What is the output when you run bazel build //inception:imagenet_train?
bazel-bin is a symbolic link to a directory.
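To see for yourself what kind of file bazel-bin is, you can test it from the shell. The sketch below is only an illustration: it builds a throwaway directory with a made-up output/bin path and a bazel-bin link pointing at it (the real link points into Bazel's own output tree, which is not reproduced here), then checks the link the same way you could check the real one.

```shell
# Hypothetical scratch directory standing in for a Bazel workspace.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/output/bin"                      # made-up link destination
ln -s "$demo_dir/output/bin" "$demo_dir/bazel-bin"   # stand-in for the real bazel-bin

# -L is true for a symbolic link; -d follows the link and is true
# only if it resolves to an existing directory.
if [ -L "$demo_dir/bazel-bin" ] && [ -d "$demo_dir/bazel-bin" ]; then
  link_kind="symlink to directory"
else
  link_kind="something else"
fi

# readlink prints the path the link points to.
link_target="$(readlink "$demo_dir/bazel-bin")"

echo "bazel-bin is a: $link_kind"
echo "it points to:   $link_target"

rm -rf "$demo_dir"
```

Running the same two tests (and readlink, or ls -l) against the real bazel-bin in your workspace shows where it points. If the target has not been built yet, the binary you are looking for will not be there.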
Based on your comment below (~/models#), it looks like you're running Bazel in the wrong directory. You have to cd to the inception/ directory before running bazel:
cd inception
/opt/DL/bazel/bin/bazel build //inception:imagenet_train
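Why the directory matters: Bazel resolves // labels relative to the workspace root, which is the nearest enclosing directory containing a WORKSPACE file. The sketch below imitates that lookup on a throwaway directory tree. The find_workspace_root helper is hypothetical (it is not a Bazel command), and the layout assumes that both models/ and models/inception/ carry their own WORKSPACE file.

```shell
# Hypothetical helper: walk upward from a directory until a WORKSPACE file is found.
find_workspace_root() {
  # Resolve to an absolute path so the upward walk always terminates.
  dir="$(cd "$1" && pwd)" || return 1
  while :; do
    if [ -f "$dir/WORKSPACE" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    [ "$dir" = "/" ] && return 1
    dir="$(dirname "$dir")"
  done
}

# Assumed layout: models/ and models/inception/ are each their own project.
root="$(mktemp -d)"
mkdir -p "$root/models/inception/inception"
touch "$root/models/WORKSPACE" "$root/models/inception/WORKSPACE"

from_top="$(find_workspace_root "$root/models")"
from_sub="$(find_workspace_root "$root/models/inception")"

echo "run from models/:           root is $from_top"
echo "run from models/inception/: root is $from_sub"

rm -rf "$root"
```

Under these assumptions, running from models/ makes //inception point at models/inception/, while running from models/inception/ makes it point at models/inception/inception/, which is where the BUILD file actually lives.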
TensorFlow chose a very odd project structure: models/ is a project, but every subdirectory of models is also its own project. I'm not sure why they did this, but you have to build the subprojects (like inception) in their own directories, not the top-level directory.
//inception:imagenet_train is called a target. Everything before the : tells you where the target is defined (the BUILD file in the inception/ directory). TensorFlow made this even more confusing by putting everything in a subdirectory named the same way as its project (e.g., this target is defined in ~/models/inception/inception/BUILD).
imagenet_train is the identifier for the target; you can see its definition here.
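The anatomy of the label described above can be sketched with plain shell parameter expansion. This is just an illustration of how the pieces relate, not something Bazel requires you to run; the variable names are made up for the example.

```shell
label="//inception:imagenet_train"

# Everything before the ':' is the package path; strip the leading '//'.
package="${label%%:*}"
package="${package#//}"

# Everything after the ':' is the target's identifier.
target="${label##*:}"

# The target is defined in the package's BUILD file,
# relative to the directory you run bazel from (the workspace root).
build_file="$package/BUILD"

echo "package:    $package"
echo "target:     $target"
echo "BUILD file: $build_file"
```

With ~/models/inception as the workspace root, inception/BUILD resolves to ~/models/inception/inception/BUILD, the file mentioned above.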
See Bazel's Getting Started docs for a more detailed explanation of what a target is.