
NotFoundError: ; on TensorFlow 1.5 Object Detection API, runs smoothly on 1.4


I recently upgraded one of my small Ubuntu (16.04) servers from tensorflow-gpu 1.4 to tensorflow-gpu 1.5 to work with the Object Detection API. I have git cloned the latest version of the API, which is supposed to work with TensorFlow 1.5.

CUDA/cuDNN and other TensorFlow programs are up and running after the upgrade, and all test scripts in the Object Detection API run fine.

Despite this, when I attempt to run train.py it fails immediately with the following error:

  File "/home/arvid/ownCloud/tensorflow/models/research/object_detection/train.py", line 167, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 124, in run
    _sys.exit(main(argv))
  File "/home/arvid/ownCloud/tensorflow/models/research/object_detection/train.py", line 107, in main
    overwrite=True)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/lib/io/file_io.py", line 385, in copy
    compat.as_bytes(oldpath), compat.as_bytes(newpath), overwrite, status)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))

tensorflow.python.framework.errors_impl.NotFoundError: ; No such file or directory

This error arises when an input file is missing, but the problem here is that the error does not name any file.

Usually the missing file's name appears between the colon and the semicolon, but in this error that spot is blank.

I can reproduce the same error on my working server running tensorflow 1.4 by inserting a space between --train_dir= and the path:

--train_dir= {some_path}

But that is not the case here!
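For anyone wondering why the stray space reproduces the error: a shell splits the command line on whitespace, so the flag ends up with an empty value and the path becomes a separate argument. Here is a minimal sketch using Python's standard shlex module. The helper flag_value and the path /tmp/train are made up for illustration; this is not how train.py parses its flags.

```python
import shlex


def flag_value(args, name):
    """Return the value passed as --name=value, or None if the flag is absent.

    Hypothetical helper for illustration only.
    """
    prefix = "--" + name + "="
    for arg in args:
        if arg.startswith(prefix):
            return arg[len(prefix):]
    return None


# shlex.split tokenizes a string the way a POSIX shell would.
good = shlex.split("train.py --train_dir=/tmp/train")
bad = shlex.split("train.py --train_dir= /tmp/train")

print(bad)                                  # the path is now its own argument
print(repr(flag_value(good, "train_dir")))  # '/tmp/train'
print(repr(flag_value(bad, "train_dir")))   # '' (empty value)
```

An empty value like this is consistent with the blank spot where the file name would normally appear in the error.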

Additional info: when I run train.py, the 'train' directory is created at the location I specify, so TensorFlow does seem able to resolve the paths I pass in.

Any input on how to debug this would be greatly appreciated!!


Solution

  • (Ok, I'm feeling a bit stupid right now...)

    The solution was simple - the name of one of the flags for train.py changed with the update...

    It used to be:

    --pipeline_config={some_path}
    

    But now it's:

    --pipeline_config_path={some_path}
    

    Still, a more informative error message would be useful...
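As a rough illustration of how a renamed flag can end up as an empty path, and what a friendlier check might look like, here is a sketch using only the standard argparse module. It assumes the old flag was silently ignored and the new one kept an empty default, which I have not verified against the TensorFlow flag code. The functions parse_flags and check_flags and the /tmp paths are hypothetical and not part of train.py.

```python
import argparse
import os


def parse_flags(argv):
    """Parse flags leniently: unrecognized flags are returned, not rejected."""
    # allow_abbrev=False stops argparse from accepting --pipeline_config
    # as a prefix abbreviation of --pipeline_config_path.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--train_dir", default="")
    parser.add_argument("--pipeline_config_path", default="")
    return parser.parse_known_args(argv)


def check_flags(flags, unknown):
    """Fail early with a message that names the offending flag or path."""
    if unknown:
        raise ValueError("Unrecognized flags: " + " ".join(unknown))
    if not flags.pipeline_config_path:
        raise ValueError("--pipeline_config_path is empty")
    if not os.path.isfile(flags.pipeline_config_path):
        raise ValueError("No such file: " + flags.pipeline_config_path)


# The old flag name is not recognized, so the new flag keeps its empty default.
flags, unknown = parse_flags(
    ["--train_dir=/tmp/train", "--pipeline_config=/tmp/pipeline.config"]
)
print(repr(flags.pipeline_config_path))  # ''
print(unknown)

try:
    check_flags(flags, unknown)
except ValueError as err:
    print(err)  # names the stale flag instead of a blank file name
```

Running a check like this before any file is copied would point straight at the stale flag name rather than at an empty path.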