I am learning gradient descent as a way to find the minimum of a function. I came across the following update rule:
m1' = m1 - alpha * d/dm1 J(m0, m1)  # m0, m1 are weights; J(m0, m1) is the loss function
It is stated that the partial derivative of the cost function gives the "direction of fastest decrease" of the cost. Can someone explain or elaborate on this?
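Take a level curve of J. The shortest path from one level curve to a nearby level curve is the path perpendicular to it, and that perpendicular direction is the direction of the gradient (the vector of partial derivatives) at that point, which can be proven mathematically. The gradient itself points toward increasing J, which is why the update rule subtracts it in order to move toward lower cost. Here m0 and m1 are the two axes (x, y) of the graph, and a level curve is the set of points where J(m0, m1), plotted along the z-direction, has the same value. For more on level sets, see https://mathinsight.org/level_sets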
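The "can be proven mathematically" step can be sketched with the directional derivative. This is a short outline rather than a full proof; it assumes J is differentiable, and the symbols u (a unit direction), t (a tangent to the level curve) and theta (the angle between u and the gradient) are introduced here only for illustration.

```latex
% Rate of change of J when moving in a unit direction u:
D_{u} J = \nabla J \cdot u = \lVert \nabla J \rVert \, \lVert u \rVert \cos\theta
        = \lVert \nabla J \rVert \cos\theta ,
\qquad
\nabla J = \left( \frac{\partial J}{\partial m_0}, \frac{\partial J}{\partial m_1} \right).

% Along a level curve J is constant, so for a tangent direction t:
D_{t} J = \nabla J \cdot t = 0
\quad\Longrightarrow\quad \nabla J \perp \text{level curve}.

% D_u J is most negative when \cos\theta = -1, i.e. when
u = -\frac{\nabla J}{\lVert \nabla J \rVert},
\qquad
\min_{\lVert u \rVert = 1} D_{u} J = -\lVert \nabla J \rVert .
```

So among all directions of the same length, the one opposite to the gradient decreases J fastest, and it is perpendicular to the level curve through the current point.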
In the graph above, suppose you step in some direction other than along the (negative) gradient. A step of the same length then lands on a level curve with a higher value than the one you would have reached by following the gradient (in the case where you are looking for a minimum). Equivalently, you would have to travel a longer distance than the shortest path to reach the level curve you were aiming for.
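The claim can also be checked numerically. The sketch below is only an illustration: the quadratic loss `J`, the starting point, the step length and the value of `alpha` are all made-up assumptions, not anything from the question. It tries many unit directions for a small fixed-length step, picks the one that lowers J the most, and compares it with the normalized negative gradient.

```python
import numpy as np

def J(m0, m1):
    # Hypothetical quadratic loss, chosen only for illustration.
    return (m0 - 1.0) ** 2 + 3.0 * (m1 + 2.0) ** 2

def grad_J(m0, m1):
    # Analytic partial derivatives (dJ/dm0, dJ/dm1) of the loss above.
    return np.array([2.0 * (m0 - 1.0), 6.0 * (m1 + 2.0)])

def best_direction(m0, m1, step=1e-3, n_angles=3600):
    # Sample unit directions around the circle and return the one that
    # gives the lowest J after a small step of fixed length.
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    values = J(m0 + step * dirs[:, 0], m1 + step * dirs[:, 1])
    return dirs[np.argmin(values)]

point = (3.0, 1.0)  # arbitrary starting weights (assumption)
g = grad_J(*point)
steepest = -g / np.linalg.norm(g)  # normalized negative gradient
found = best_direction(*point)

print("negative gradient direction:", steepest)
print("best sampled direction:     ", found)

# One step of the update rule m' = m - alpha * dJ/dm, applied to both weights.
alpha = 0.05  # learning rate (assumption)
m = np.array(point)
m_new = m - alpha * grad_J(*m)
print("J before:", J(*m), "J after:", J(*m_new))
```

The two printed directions should agree up to the angular resolution of the sampling, and the cost after the update step should be lower than before, provided alpha is small enough for this loss.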