Tags: opencv, tensorflow, google-cloud-platform, google-vision, vision-api

How to "Iterate" on Computer Vision machine learning model?


I've created a model using Google Cloud's Vision API. I spent countless hours labelling data and trained a model. After almost 20 hours of "training", the model is still hit and miss.

How can I iterate on this model without losing the "learning" it's done so far? It currently works about 3 times out of 5.

My best guess is that I should loop over the objects again, find where the model is wrong, and label accordingly. But I'm not sure of the best method for that: should I label all images where it "misses" as TEST data images? Are there best practices or resources I can read on this topic?


Solution

  • I'm by no means an expert, but here's what I'd suggest, in order of most to least important:

    1) Add more data if possible. More data is almost always a good thing, and helps make your network's predictions more robust.

    2) Add dropout layers to help prevent over-fitting.

    3) Have a tinker with the kernel and bias initialisers.

    4) [The most relevant answer to your question] Save the trained weights of your model and reload them into a new model before further training.

    5) Change up the type of model architecture you're using. Then, have a tinker with epoch numbers, validation splits, loss evaluation formulas, etc.

    Hope this helps!
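    As a concrete sketch of point 2: in Keras, dropout layers sit between the layers whose activations you want to regularise. The layer sizes and the 0.5 / 0.3 rates below are placeholders, not recommendations.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Toy classifier with dropout between dense layers; all sizes and
# dropout rates are placeholders to be tuned for your own data.
model = keras.Sequential([
    keras.Input(shape=(64,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),   # randomly zeroes 50% of activations during training
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),
])
```

    Dropout is only active during training; at inference time the layers pass activations through unchanged.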


    EDIT: More information about number 4

    You can save and load your model's weights during or after training. See the Keras documentation on saving and loading models for more in-depth information.

    Broadly, let's cover the basics. I'm assuming you're using Keras, but the same applies to tf.keras:

    Saving the model after training

    Simply call:

    # serialize the model architecture to JSON
    model_json = model.to_json()
    with open("{Your_Model}.json", "w") as json_file:
        json_file.write(model_json)

    # serialize the weights to HDF5
    model.save_weights("{Your_Model}.h5")
    print("Saved model to disk")
    
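    As an aside (my own assumption, not something the original answer relies on): recent tf.keras versions can also save the architecture and weights together in a single file, which avoids juggling separate JSON and HDF5 files. The filename below is a placeholder.

```python
from tensorflow import keras

# toy model purely for illustration; "demo_model.h5" is a placeholder name
model = keras.Sequential([keras.Input(shape=(8,)), keras.layers.Dense(4)])

model.save("demo_model.h5")                         # architecture + weights in one file
restored = keras.models.load_model("demo_model.h5")
```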

    Loading the model

    You can load the model structure from json like so:

    from keras.models import model_from_json 
    
    json_file = open('{Your_Model}.json', 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    model = model_from_json(loaded_model_json)
    

    And load the weights if you want to:

    model.load_weights('{Your_Weights}.h5', by_name=True)
    

    Then compile the model and you're ready to retrain/predict. For me, by_name=True was essential for reloading the weights into the same model architecture; leaving it out may cause an error.
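    That compile-then-retrain step might look like this; the optimizer, loss and metrics below are assumptions on my part, so use whatever matches your task.

```python
from tensorflow import keras

# stand-in for the model reloaded from JSON above; shapes are placeholders
model = keras.Sequential([keras.Input(shape=(64,)),
                          keras.layers.Dense(10, activation="softmax")])

# optimizer/loss/metrics are placeholders to be matched to your problem
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```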

    Checkpointing the model during training

    import tensorflow as tf

    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath="{checkpoint_path}",
                                                     save_weights_only=True,
                                                     verbose=1)
    
    # Train the model with the new callback
    model.fit(train_images, 
              train_labels,  
              epochs=10,
              validation_data=(test_images,test_labels),
              callbacks=[cp_callback])  # Pass callback to training
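    To pick training back up from a checkpoint, rebuild the same architecture and restore the saved weights. A minimal sketch, where the path and toy architecture are placeholders:

```python
from tensorflow import keras

checkpoint_path = "cp.weights.h5"   # placeholder path

# toy model standing in for your real architecture
model = keras.Sequential([keras.Input(shape=(3,)), keras.layers.Dense(2)])
model.save_weights(checkpoint_path)

# later: rebuild the identical architecture and restore the weights
fresh = keras.Sequential([keras.Input(shape=(3,)), keras.layers.Dense(2)])
fresh.load_weights(checkpoint_path)
```

    Because load_weights matches weights to the architecture, the rebuilt model must have the same layer structure as the one that was checkpointed.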