gradient-descent

How does the derivative of the cost function give the direction of fastest decrease in cost?


I am learning gradient descent to find the minimum of a function. I came across the following update rule:

m1' = m1 - alpha* d/dm1 j(m0,m1) # m0,m1 are weights, j(m0,m1) is the loss function

It is stated that the partial derivative of the cost function gives the "direction of fastest" decrease of the cost. Can someone explain or elaborate on this?


Solution

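  • Take a level set curve. The shortest path from one level set curve to a neighboring one is the path perpendicular to it, and the gradient (the vector of partial derivatives) at a point is perpendicular to the level curve through that point, which can be proven mathematically. Briefly, the rate of change of J along a unit direction u is the dot product of the gradient with u, which is most negative when u points exactly opposite the gradient. The gradient itself points toward increasing J, so the update rule subtracts it in order to move toward lower cost. Here m0 and m1 are the two axes (x, y) of the graph, and a level curve connects the points where J(m0, m1), plotted in the z-direction, has the same value. For more on level sets, see https://mathinsight.org/level_sets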

    [Figure: graph of level set curves]

    In the graph above, if you choose a direction other than the one given by the derivative, a step of the same length lands you on a level set curve with a higher value than the one you would reach along the derivative direction (in the case of finding a minimum). Put another way, you would have to travel a longer distance than the shortest path to reach the same level set curve.
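
The argument above can be checked numerically with a small Python sketch. It takes a tiny step of fixed length in many directions, finds the direction that lowers the cost the most, and compares it with the direction opposite the gradient. The cost function J, its gradient grad_J, the starting point (3, 0), and alpha = 0.1 are all hypothetical choices made for illustration; they are not from the original question.

```python
import math

def J(m0, m1):
    # Hypothetical bowl-shaped cost, assumed only for this illustration
    return (m0 - 1.0) ** 2 + 3.0 * (m1 + 2.0) ** 2

def grad_J(m0, m1):
    # Partial derivatives (dJ/dm0, dJ/dm1) of the J defined above
    return 2.0 * (m0 - 1.0), 6.0 * (m1 + 2.0)

def best_direction_angle(m0, m1, step=1e-3, n=3600):
    # Brute force: take a tiny step of fixed length in n evenly spaced
    # directions and keep the angle whose step gives the lowest cost.
    best_angle, best_cost = 0.0, float("inf")
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        cost = J(m0 + step * math.cos(theta), m1 + step * math.sin(theta))
        if cost < best_cost:
            best_angle, best_cost = theta, cost
    return best_angle

def negative_gradient_angle(m0, m1):
    # Angle of the vector pointing opposite the gradient, in [0, 2*pi)
    g0, g1 = grad_J(m0, m1)
    return math.atan2(-g1, -g0) % (2.0 * math.pi)

def angle_gap(a, b):
    # Smallest absolute difference between two angles, in radians
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def gradient_step(m0, m1, alpha):
    # One update of the form m' = m - alpha * dJ/dm for both weights
    g0, g1 = grad_J(m0, m1)
    return m0 - alpha * g0, m1 - alpha * g1

m0, m1 = 3.0, 0.0
print("best direction found by search:", best_direction_angle(m0, m1))
print("direction opposite the gradient:", negative_gradient_angle(m0, m1))
new_m0, new_m1 = gradient_step(m0, m1, alpha=0.1)
print("cost before and after one step:", J(m0, m1), J(new_m0, new_m1))
```

The two printed angles should agree up to the resolution of the search grid, and the cost after the update should be lower than before. With a much larger alpha the step can overshoot the minimum, because the gradient only describes the fastest decrease locally.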