Tags: deep-learning, stochastic-gradient

Stochastic gradient for deep learning


I am reading about the deep learning concept of stochastic gradient descent. In the snapshot below, I do not understand the statement: "The general problem with taking a significant step in this direction, however, is that the gradient could be changing under our feet as we move!" The text demonstrates this in the following figure, but I am not able to interpret the figure. Kindly explain.

[figure: the snapshot from the book referenced above]


Solution

  • We want to reduce the error between the predicted value and the actual value. Consider the actual and predicted values as points in 2D: you should move the point of the predicted value as close as possible to the point of the actual value. To move the point you need a direction, and SGD provides it.

    [figure: contour plot whose center C is the actual value, with starting point P1 and blue arrows/dots marking the SGD steps]

    Look at the image: C, the center of the contours, is the actual value, and P1 is the first predicted value. The SGD direction (blue arrow) shows a direction that reduces the distance between P1 and C. If you start from P1 and take a significant (big) step in the first arrow's direction, you will end up at P2, which is far from C. However, if you take small steps (the blue dots) and at each step move along the new SGD direction (the blue arrows at each point), you will reach a point close to C (see the first sketch below).

    Big steps make you fluctuate around the actual value, while steps that are too small take too long to reach it. Most of the time, we use big steps at the start of the learning process and then make them smaller and smaller (see the second sketch below).
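
To make this concrete, here is a minimal NumPy sketch of the same picture. The quadratic bowl `A`, the center `c`, the starting point `p1`, and both learning rates are made-up values, not taken from the figure; the point is only that one big step overshoots C, while many small steps, each along the gradient recomputed at the current point, converge to it.

```python
import numpy as np

# Loss surface: f(w) = (w - c)^T A (w - c), a bowl whose contours are
# ellipses centered at c (the "actual value" C in the figure).
A = np.array([[3.0, 0.0],
              [0.0, 1.0]])
c = np.array([0.0, 0.0])           # center of the contours (C)

def grad(w):
    # Gradient of f at w; the descent direction is its negative.
    return 2.0 * A @ (w - c)

p1 = np.array([4.0, 2.0])          # first predicted value (P1)

# One significant (big) step: overshoots past C and lands far away (P2).
big_lr = 0.9
p2 = p1 - big_lr * grad(p1)
print("big step:   ", p2, "distance to C:", np.linalg.norm(p2 - c))

# Many small steps (the blue dots), recomputing the gradient at every
# point because it "changes under our feet" as we move.
w = p1.copy()
small_lr = 0.05
for _ in range(100):
    w = w - small_lr * grad(w)
print("small steps:", w, "distance to C:", np.linalg.norm(w - c))
```

Running it shows the single big step ending up farther from C than P1 was, while the small steps land essentially on C.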
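
The "big steps first, then smaller and smaller" strategy from the last paragraph is a learning rate schedule. Here is a sketch with a 1/t-style decay, which is one common choice among many; the initial rate and decay constant are again made-up values:

```python
import numpy as np

# Same bowl as above; this time the step size shrinks over the run,
# so early iterations move fast and late iterations stop fluctuating.
A = np.array([[3.0, 0.0],
              [0.0, 1.0]])
c = np.array([0.0, 0.0])

def grad(w):
    return 2.0 * A @ (w - c)

w = np.array([4.0, 2.0])
lr0, decay = 0.3, 0.05             # initial rate and decay speed (assumed)
for t in range(100):
    lr = lr0 / (1.0 + decay * t)   # 1/t-style decay schedule
    w = w - lr * grad(w)
print("final point:", w, "distance to C:", np.linalg.norm(w - c))
```

Early on the large rate makes the iterate bounce back and forth across C (the fluctuation described above); as the rate decays the bouncing dies out and the iterate settles onto C.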