I have an RNN model. After about 10K iterations, the loss stops decreasing, but it is not very small yet. Does this always mean the optimization is trapped in a local minimum?
In general, what actions should I take to address this issue? Add more training data? Switch to a different optimization scheme (I use SGD now)? Or are there other options?
Many thanks!
JC
If you are training your neural network using a gradient-based algorithm such as Back Propagation or Resilient Propagation, it can stop improving when it finds a local minimum. This is normal given the nature of this type of algorithm: the propagation algorithm only searches in the direction the gradient vector points to.
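To make the local-minimum behaviour concrete, here is a minimal sketch in Python. The one-dimensional function `loss` is a toy stand-in I chose for illustration (it is not an RNN loss), and the names `grad` and `gradient_descent` are hypothetical. It only shows that plain gradient descent ends up in whichever minimum its starting point leads to.

```python
def loss(x):
    # Toy non-convex "loss" with two minima (an assumption for illustration).
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of the toy loss.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x, lr=0.01, steps=1000):
    # Follow the negative gradient only; there is no exploration step.
    for _ in range(steps):
        x -= lr * grad(x)
    return x


x_right = gradient_descent(2.0)   # starts in the basin of the shallower minimum
x_left = gradient_descent(-2.0)   # starts in the basin of the deeper minimum

print("start  2.0 -> x = %.4f, loss = %.4f" % (x_right, loss(x_right)))
print("start -2.0 -> x = %.4f, loss = %.4f" % (x_left, loss(x_left)))
```

Both runs stop where the gradient is close to zero, but the run started at 2.0 settles at a higher loss than the run started at -2.0, and more iterations will not change that.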
As a suggestion, you could add a different strategy during training that explores the search space instead of only following the gradient, for example a Genetic Algorithm or the Simulated Annealing algorithm. These approaches explore other possibilities and can find a global minimum. You could run 10 iterations of the exploratory algorithm after every 200 iterations of the propagation algorithm, creating a hybrid strategy. For example (it's just pseudo-code):
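int epochs = 0;
do
{
    train();  // one iteration of the propagation algorithm
    epochs++;
    // after every 200 propagation iterations, run 10 exploratory iterations
    if (epochs % 200 == 0)
        for (int i = 0; i < 10; i++)
            trainExplorativeApproach();
} while (epochs < 10000);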
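The pseudo-code above can be sketched as runnable Python on a toy problem. Everything here is an assumption for illustration: the one-dimensional `loss`, the names `explore` and `hybrid_train`, the step size `sigma`, the linear temperature schedule, and the search bounds of [-3, 3]. The exploratory step is a simple simulated-annealing variant, not the exact method used in any particular library.

```python
import math
import random


def loss(x):
    # Toy non-convex "loss" with two minima (an assumption for illustration).
    return x**4 - 3 * x**2 + x


def grad(x):
    return 4 * x**3 - 6 * x + 1


def explore(x, rng, temperature, proposals=10, sigma=1.0):
    # Simulated-annealing style step: propose random jumps and sometimes
    # accept a worse point, with probability exp(-delta / temperature).
    for _ in range(proposals):
        candidate = max(-3.0, min(3.0, x + rng.gauss(0.0, sigma)))
        delta = loss(candidate) - loss(x)
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            x = candidate
    return x


def hybrid_train(x, epochs=10000, lr=0.01, explore_every=200, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best = x
    for epoch in range(1, epochs + 1):
        x -= lr * grad(x)  # one gradient ("propagation") step
        if loss(x) < loss(best):
            best = x
        if epoch % explore_every == 0:
            # Temperature decays linearly towards a small positive floor.
            temperature = (1.0 - epoch / epochs) + 1e-3
            x = explore(x, rng, temperature)
            if loss(x) < loss(best):
                best = x
    return best  # best point seen, so exploration never makes the result worse


best = hybrid_train(2.0)
print("hybrid result: x = %.4f, loss = %.4f" % (best, loss(best)))
```

Because the function returns the best point seen so far, the result is never worse than what gradient descent alone reached before the first exploratory phase. Whether it actually reaches the deeper minimum depends on the random proposals, so treat this as a demonstration of the structure rather than a guarantee.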
I've developed a strategy like this using Multi-Layer Perceptrons and Elman recurrent neural networks in classification and regression problems, and in both cases the hybrid strategy provided better results than propagation training alone.