
TensorFlow: NotFoundError: No such file or directory


I am having trouble restoring TensorFlow model weights.

During training, I saved a model checkpoint every 500 iterations:

if j % 500 == 0:
    with open('iterres.txt', 'a') as f:
        f.write(str({'epoch': i, 'test_accuracy': evaluate_(model, batch_size=100), 'iteration': j}) + '\n')
        os.system('mkdir ' + str(i) + 'epoch' + str(j))
        saver.save(sess, '/home/g_cloud/exe_paul/' + str(i) + 'epoch' + str(j) + '/' + str(i))

This gave me a checkpoint folder containing all the weights and meta files.

I downloaded the weights and created a new folder named "new_backup" that holds all the weights and meta files.

When I try to load the files from that folder:

import tensorflow as tf


labels_dict={
              1: 'Yes', 
              0: 'No'
            }


with tf.Session() as sess:
    saver = tf.train.import_meta_graph('../new_backup/1.meta')
    restore = saver.restore(sess, tf.train.latest_checkpoint('../new_backup/'))
    graph = tf.get_default_graph()

    query = graph.get_tensor_by_name("input:0")
    result = graph.get_tensor_by_name("netout:0")

I then get this error:

NotFoundError: /home/g_cloud/exe_paul/1epoch1000; No such file or directory

That path is on my cloud account, and 1epoch1000 is the old folder where the weights were saved during training. If I run the same restore script on the cloud machine, where the 1epoch1000 folder exists, it works; anywhere else it raises this error.

How can I change the meta file to redirect the path, or otherwise restore the model anywhere?


Solution

  • I tried to find an answer but had no luck, so I experimented. When you save a model you get four files:

    model.data
    model.index
    model.meta
    checkpoint
    

    Now open the checkpoint file in a text editor; it contains some paths:

    model_checkpoint_path: "/home/g_cloud/exe_paul/1epoch1000/model"
    all_model_checkpoint_paths: "/home/g_cloud/exe_paul/1epoch500/0"
    all_model_checkpoint_paths: "/home/g_cloud/exe_paul/1epoch1000/0"
    all_model_checkpoint_paths: "/home/g_cloud/exe_paul/1epoch2000/1"
    all_model_checkpoint_paths: "/home/g_cloud/exe_paul/1epoch2500/1"
    all_model_checkpoint_paths: "/home/g_cloud/exe_paul/1epoch3000/1"
    

    Change the first path, model_checkpoint_path, to the local path of the model on your machine (the folder plus the checkpoint file prefix, without an extension).

    After that, point the loading code at the local path as well:

    saver = tf.train.import_meta_graph('../new_backup/1.meta')
    restore = saver.restore(sess,tf.train.latest_checkpoint('../new_backup/'))
    

    And that's it.
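
The manual edit of the checkpoint file described above can also be scripted. The following is a minimal sketch, not part of TensorFlow: `localize_checkpoint_file` is a hypothetical helper name, and it assumes the .index, .data and .meta files now sit in the same folder as the checkpoint file. It replaces each recorded path with just its last component, relying on my understanding that `tf.train.latest_checkpoint` resolves non-absolute entries against the directory it is given.

```python
import os
import re

# Matches lines such as: model_checkpoint_path: "/some/dir/1"
# and:                   all_model_checkpoint_paths: "/some/dir/1"
_PATH_LINE = re.compile(r'^(\s*(?:all_)?model_checkpoint_paths?:\s*)"(.*)"\s*$')


def localize_checkpoint_file(checkpoint_dir):
    """Rewrite every path in <checkpoint_dir>/checkpoint to a bare prefix.

    Hypothetical helper for illustration. Only the last path component
    (for example "1") is kept, which assumes the weight files were copied
    into checkpoint_dir itself.
    """
    path = os.path.join(checkpoint_dir, 'checkpoint')
    with open(path) as f:
        lines = f.read().splitlines()

    rewritten = []
    for line in lines:
        match = _PATH_LINE.match(line)
        if match:
            # Drop the old directory, keep the checkpoint file prefix.
            prefix = os.path.basename(match.group(2))
            line = '{}"{}"'.format(match.group(1), prefix)
        rewritten.append(line)

    with open(path, 'w') as f:
        f.write('\n'.join(rewritten) + '\n')
    return rewritten


# Example call (path taken from the question):
# localize_checkpoint_file('../new_backup')
```

Note that entries which came from different training folders but share a prefix (such as the several "1" entries above) collapse to the same name; only model_checkpoint_path is used when looking up the latest checkpoint, so it is the one that has to match the files actually present. It should also be possible to skip the checkpoint file entirely by passing the prefix straight to the saver, e.g. `saver.restore(sess, '../new_backup/1')`.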