Tags: machine-learning, deep-learning, pytorch, cross-validation, fast-ai

Learning rate & gradient descent difference?


What is the difference between the two? Both serve to reach the minimum point (lowest loss) of a function, for example.

I understand (I think) that the learning rate is multiplied by the gradient (slope) to take a gradient descent step, but is that right? Am I missing something?

What is the difference between lr and gradient?

Thanks


Solution

  • Deep learning neural networks are trained using the stochastic gradient descent algorithm.

    Stochastic gradient descent is an optimization algorithm that estimates the error gradient for the current state of the model using examples from the training dataset, then updates the weights of the model using the back-propagation of errors algorithm, referred to as simply backpropagation.

    The amount that the weights are updated during training is referred to as the step size or the “learning rate.”

    Specifically, the learning rate is a configurable hyperparameter used in the training of neural networks that has a small positive value, often in the range between 0.0 and 1.0.

    The learning rate controls how quickly the model is adapted to the problem. Smaller learning rates require more training epochs given the smaller changes made to the weights each update, whereas larger learning rates result in rapid changes and require fewer training epochs.
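The relationship the question asks about can be sketched directly. This is a minimal one-dimensional example (not from the original post), using `f(w) = w**2` as a stand-in loss: the gradient is the slope of the loss at the current weight, and the learning rate is the separate constant that scales how far each step moves.

```python
def gradient_descent(w0, lr, steps):
    """Repeatedly apply the update w <- w - lr * gradient for f(w) = w**2."""
    w = w0
    for _ in range(steps):
        grad = 2 * w       # gradient (slope) of f(w) = w**2 at the current w
        w = w - lr * grad  # the learning rate scales the size of the step
    return w

# Starting from w = 10.0, the iterates shrink toward the minimum at w = 0.
final_w = gradient_descent(w0=10.0, lr=0.1, steps=100)
```

So the gradient changes every step as the weights move, while the learning rate is a fixed hyperparameter you choose before training.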

    A learning rate that is too large can cause the model to converge too quickly to a suboptimal solution, whereas a learning rate that is too small can cause the process to get stuck.
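That trade-off is easy to see on the same toy quadratic. The sketch below (my own illustration, same `f(w) = w**2` loss as above) runs the identical update with three learning rates: one too small, one reasonable, and one so large that every step overshoots the minimum and the weight diverges.

```python
def run(lr, steps=20, w=10.0):
    """Gradient descent on f(w) = w**2 with a fixed learning rate."""
    for _ in range(steps):
        w = w - lr * (2 * w)
    return w

small = run(lr=0.01)   # too small: after 20 steps, still far from the minimum
good = run(lr=0.4)     # reasonable: converges quickly toward w = 0
too_big = run(lr=1.5)  # too large: each step overshoots and |w| blows up
```

With `lr=1.5` each update multiplies the weight by (1 - 2*1.5) = -2, so the iterates alternate sign and grow, which is exactly the divergence the paragraph above warns about.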

    The challenge of training deep learning neural networks involves carefully selecting the learning rate. It may be the most important hyperparameter for the model.

    The learning rate is perhaps the most important hyperparameter. If you have time to tune only one hyperparameter, tune the learning rate.

    — Page 429, Deep Learning, 2016.
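In the spirit of that advice, tuning the learning rate can be as simple as sweeping a few candidate values and keeping the one that ends with the lowest loss. This is a hedged sketch on the same toy `f(w) = w**2` problem, not a recipe from the quoted book:

```python
def final_loss(lr, steps=50, w=10.0):
    """Loss f(w) = w**2 after running gradient descent with this learning rate."""
    for _ in range(steps):
        w = w - lr * (2 * w)
    return w * w

# Try candidates spaced by powers of ten, a common starting grid.
candidates = [1e-3, 1e-2, 1e-1, 1.0]
best_lr = min(candidates, key=final_loss)
```

In practice you would compare validation loss after a short training run for each candidate, but the selection logic is the same.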

    For more on what the learning rate is and how it works, see the post:

    How to Configure the Learning Rate Hyperparameter When Training Deep Learning Neural Networks

    You can also refer to: Understand the Impact of Learning Rate on Neural Network Performance