machine-learning, neural-network, training-data

The dilemma of overfitting in NN training


My question is a follow-up to one asked by another user: What is the difference between train, validation and test set in neural networks?

Learning is terminated at the minimum MSE, judged from the training and validation set performance (easy to do with the nntool box in Matlab). If the trained network's performance on the unseen test set is then slightly worse than on the training set, we have an overfitting problem. I always run into this case, even though I select the model whose validation and training performance are nearly equal during learning. How can the test set performance still be worse than the training set performance?
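
For concreteness, a rough sketch of this kind of setup, written in Python with scikit-learn rather than the Matlab nntool box (the data, network size and split ratios below are placeholders, not the actual setup):

    import numpy as np
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor

    # Toy data standing in for the real dataset.
    rng = np.random.default_rng(0)
    X = rng.random((1000, 8))
    y = X.sum(axis=1) + 0.1 * rng.standard_normal(1000)

    # Hold out 20% as the unseen test set.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    # early_stopping=True carves a validation split out of the training
    # data and stops when the validation score stops improving, i.e.
    # roughly at the minimum validation error.
    net = MLPRegressor(hidden_layer_sizes=(20,),
                       early_stopping=True,
                       validation_fraction=0.125,  # ~10% of all the data
                       n_iter_no_change=20,
                       max_iter=1000,
                       random_state=0)
    net.fit(X_train, y_train)

    print("train MSE:", mean_squared_error(y_train, net.predict(X_train)))
    print("test MSE: ", mean_squared_error(y_test, net.predict(X_test)))
    # The test MSE usually comes out somewhat higher than the train MSE,
    # which is exactly the gap asked about here.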


Solution

  • Training data = data we use to train our model.

    Validation data = data we use to evaluate the model at every epoch, at run time, so that we can stop training early by hand when we see over-fitting or other problems. Suppose I run 1000 epochs and at epoch 500 I see the model reaching 90% accuracy on the training data but only 70% on the validation data. The model is clearly over-fitting, so I can stop training manually before the 1000 epochs complete, tune the model further, and then watch its behavior again (see the sketch after this list).

    Testing data = data we touch only after training is complete. After the full 1000 epochs, I predict on the test data and check the accuracy there; say it comes out at 86%.

    So my training accuracy is 90%, validation accuracy 87%, and testing accuracy 86%. These numbers vary because the samples in the training, validation and testing sets are entirely different. With a 70% / 10% / 20% train/validation/test split of 100 images, the model might predict 8 of the 10 validation images correctly and 18 of the 20 test images correctly. Small differences like this are normal in real projects, because the pixels in every image differ from those in every other image.

    The test set also contains more images than the validation set, which may be another reason: the more images there are, the more chances for a wrong prediction. For example, at 90% accuracy my model predicts 90 out of 100 images correctly, but if I increase the sample to 1000 images it may predict 850, 800 or 900 of them correctly, so the measured accuracy itself shifts between 80% and 90%.
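
Putting the pieces together, a minimal sketch of this recipe, assuming Python with scikit-learn (the data, network size, split ratios and the 20-point stopping gap are illustrative placeholders):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Toy "image" data: 100 samples of 64 pixels each, two classes.
    rng = np.random.default_rng(0)
    X = rng.random((100, 64))
    y = (X.mean(axis=1) > 0.5).astype(int)

    # 70% training, 10% validation, 20% testing, as above.
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, train_size=0.7, random_state=0)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=2 / 3, random_state=0)

    clf = MLPClassifier(hidden_layer_sizes=(32,), random_state=0)
    classes = np.unique(y)

    for epoch in range(1000):
        # One partial_fit pass plays the role of one epoch here.
        clf.partial_fit(X_train, y_train, classes=classes)
        train_acc = clf.score(X_train, y_train)
        val_acc = clf.score(X_val, y_val)
        # Manual early stopping: quit once the train/validation gap
        # opens up (e.g. 90% train vs 70% validation).
        if train_acc - val_acc > 0.20:
            print(f"stopping at epoch {epoch}: "
                  f"train={train_acc:.2f}, val={val_acc:.2f}")
            break

    # The test set is scored exactly once, after training has stopped.
    print("test accuracy:", clf.score(X_test, y_test))

Because the test split never influences when training stops, its accuracy stays an honest estimate of how the model will behave on genuinely unseen data.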