java machine-learning neural-network image-recognition encog

Neural Network Training Criteria: How to Train on Multiple Categories (i.e. shape and color) Without Over-Training

I've been exploring image recognition via Neural Networks. After some research, I started with Encog and their "ImageNeuralNetwork.java" example.

In their example, they use one image per US currency coin (penny, dime, etc.) as a training set and then identify a given image of a coin accordingly.

Now I want to use their example as a starting point to practice with different images. I'm trying to use shapes/colors as training. For example, I want the program to recognize the difference between a red circle and red rectangle, but I also want to recognize the difference between a red circle and a blue circle.

I remember reading that you shouldn't over-train by supplying every possible combination as training images (in this case, that would be 4 images: 2 differently colored circles and 2 differently colored rectangles).

Would I still be able to use Encog's coin identification example to train on multiple categories (shape and color) or is this another concept? Is there a particular minimum number of training images I can provide without providing every possible color/shape combination and thus over-training?

Solution

When it comes to avoiding over-training, there are no reliable rules of thumb. It depends entirely on the structure of your network and the features of your data. Most people who build neural networks manage over-training (or over-fitting) by trial and error. If your network classifies the training data with high accuracy but the testing data with poor accuracy, you are over-training: reduce the number of training iterations, rebuild the network, and repeat until the gap closes. So, to answer your second question, there is no particular minimum number of images.
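The trial-and-error loop described above can be partly automated. Here is a minimal sketch in plain Java (no Encog calls), assuming you record the error on a held-out test set after every training iteration. The class name, method names, and the sample numbers are hypothetical and only there for illustration:

```java
public class EarlyStoppingSketch {

    /**
     * Scans per-iteration test errors and returns the index of the lowest one,
     * giving up once the error has failed to improve for `patience`
     * consecutive iterations.
     */
    public static int bestEpoch(double[] testErrors, int patience) {
        if (testErrors.length == 0) {
            throw new IllegalArgumentException("need at least one error value");
        }
        int best = 0;
        int sinceImprovement = 0;
        for (int epoch = 1; epoch < testErrors.length; epoch++) {
            if (testErrors[epoch] < testErrors[best]) {
                best = epoch;            // new minimum, reset the counter
                sinceImprovement = 0;
            } else if (++sinceImprovement >= patience) {
                break;                   // test error stopped improving
            }
        }
        return best;
    }

    /** Flags the symptom from the answer: high training accuracy, poor test accuracy. */
    public static boolean looksOverTrained(double trainAccuracy,
                                           double testAccuracy,
                                           double maxGap) {
        return trainAccuracy - testAccuracy > maxGap;
    }

    public static void main(String[] args) {
        // Made-up error curve: improves at first, then starts to rise.
        double[] testErrors = {0.40, 0.31, 0.25, 0.22, 0.24, 0.27, 0.30};
        System.out.println("Best iteration index: " + bestEpoch(testErrors, 2));
        System.out.println("Over-trained? " + looksOverTrained(0.99, 0.70, 0.10));
    }
}
```

The `patience` and `maxGap` thresholds are arbitrary here; like the iteration count itself, they would have to be tuned for your network and data.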

As for your first question, you can definitely train on multiple categories, and there are several ways to do it: give the network multiple output neurons for each category, or use an encoded output. Most commonly, though, having a separate network for each category (one for shape, one for color) works better. Also, for color or shape recognition, principal component analysis works better than a neural network in most cases.
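To make the two output layouts concrete, here is a rough sketch of how the target vectors for training could be built. It is plain Java and independent of Encog. The "encoded" variant shown is only one possible reading of that term (one output neuron per shape/color combination), and all class and method names are hypothetical:

```java
public class OutputEncodingSketch {

    static final String[] SHAPES = {"circle", "rectangle"};
    static final String[] COLORS = {"red", "blue"};

    /** Multiple output neurons: one per shape followed by one per color. */
    public static double[] multiOutputTarget(String shape, String color) {
        double[] target = new double[SHAPES.length + COLORS.length];
        target[indexOf(SHAPES, shape)] = 1.0;
        target[SHAPES.length + indexOf(COLORS, color)] = 1.0;
        return target;
    }

    /** Encoded output (assumed reading): one neuron per (shape, color) pair. */
    public static double[] combinedTarget(String shape, String color) {
        double[] target = new double[SHAPES.length * COLORS.length];
        target[indexOf(SHAPES, shape) * COLORS.length + indexOf(COLORS, color)] = 1.0;
        return target;
    }

    /** Decodes a multi-output activation by taking the strongest neuron in each group. */
    public static String decode(double[] output) {
        int shape = argMax(output, 0, SHAPES.length);
        int color = argMax(output, SHAPES.length, output.length) - SHAPES.length;
        return COLORS[color] + " " + SHAPES[shape];
    }

    private static int indexOf(String[] values, String wanted) {
        for (int i = 0; i < values.length; i++) {
            if (values[i].equals(wanted)) {
                return i;
            }
        }
        throw new IllegalArgumentException("unknown label: " + wanted);
    }

    /** Index of the largest value in output[from, to). */
    private static int argMax(double[] output, int from, int to) {
        int best = from;
        for (int i = from + 1; i < to; i++) {
            if (output[i] > output[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(multiOutputTarget("circle", "red")));
        System.out.println(java.util.Arrays.toString(combinedTarget("rectangle", "red")));
        System.out.println(decode(new double[] {0.9, 0.2, 0.1, 0.8}));
    }
}
```

With the multi-output layout, the shape neurons and the color neurons are read independently, so the network can in principle report a combination it never saw as a single training image. The combined layout needs at least one example for every pair.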