
Avoiding overfitting while training a neural network with Tensorflow


I am training a neural network using TensorFlow's Object Detection API to detect cars. I used the following YouTube video to learn and execute the process.

https://www.youtube.com/watch?v=srPndLNMMpk&t=65s

Parts 1 to 6 of his series.

Now in his video, he mentions stopping the training when the loss value averages around ~1 or below, and says that this takes roughly 10,000 steps.

In my case, I am at 7,500 steps right now, and the loss values keep fluctuating between 0.6 and 1.3.

A lot of people complained in the comment section about false positives in this series, but I think that happened because training was prolonged unnecessarily (perhaps because they didn't know when to stop?), which caused overfitting!

I would like to avoid this problem. I don't need the most optimal weights, just fairly good ones, while avoiding false detections and overfitting. I am also watching the 'Total Loss' section of TensorBoard; it fluctuates between 0.8 and 1.2. When should I stop the training process?

I would also like to know, in general, which factors 'stopping the training' depends on. Is it always about an average loss of 1 or less?

Additional information: my training set has ~300 images and my test set ~20 images.

Since I am using transfer learning, I chose the ssd_mobilenet_v1 model.

TensorFlow version 1.9 (on CPU), Python version 3.6.

Thank you!


Solution

  • You should use a validation set, distinct from both the training set and the test set.

    At each epoch, compute the loss on both the training and validation sets. If the validation loss begins to increase, stop your training. You can then evaluate your model on the test set.

    The validation set is usually the same size as the test set. For example, the training set might be 70% of the data, with the validation and test sets 15% each.

    Also, please note that ~300 images is probably not enough data for this task. You should increase your dataset size.

    As for your other question: the loss is the sum of your errors and thus depends on the problem and your data. A loss of 1 does not mean much in this regard; never rely on an absolute loss value to decide when to stop training.
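    The early-stopping logic above can be sketched as a small, framework-agnostic helper. This is a minimal illustration, not the Object Detection API's own mechanism; the loss values and the `patience` parameter here are hypothetical, chosen only to demonstrate the "stop when validation loss stops improving" rule.

    ```python
    def should_stop(val_losses, patience=3):
        """Return True when the validation loss has not improved
        for `patience` consecutive epochs."""
        if len(val_losses) <= patience:
            return False
        # best validation loss seen before the last `patience` epochs
        best = min(val_losses[:-patience])
        # stop if none of the last `patience` losses beat that best
        return all(loss >= best for loss in val_losses[-patience:])

    # Usage: record the validation loss after each epoch and check.
    # These numbers are made up: loss falls, then starts creeping up.
    history = []
    for epoch, val_loss in enumerate([1.2, 0.9, 0.7, 0.72, 0.75, 0.8]):
        history.append(val_loss)
        if should_stop(history, patience=3):
            print(f"stopping at epoch {epoch}")  # prints: stopping at epoch 5
            break
    ```

    The key design point is that the decision uses the *trend* of the validation loss relative to its earlier best, not any absolute threshold like "loss ≈ 1".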