Tags: python, data-science, linear-regression, gradient-descent

How does the graph of gradient descent work?


I am having a problem understanding gradient descent. For example, take a simple linear regression with one feature: after plotting the regression line, the error is calculated as Ypred - Yact, and then the cost function is calculated for each slope and intercept of the regression line. This cost function is then plotted against the slope and intercept to find its lowest value with respect to the slope and intercept through the gradient.

Why are we plotting the graph of the cost function and then finding its lowest value?

The model will be calculating the cost function for different slopes and intercepts anyway, right? So can't we just identify the lowest value of the function there, instead of plotting the graph, finding the gradient, and then updating the slope and intercept?


Solution

  • When you're making a model based on one training feature and one target feature, you can use the direct closed-form equation y = mx + c (a small code sketch of this computation is given after this answer), where

    m = (n(Σxy)-(Σx)(Σy))/(n(Σx^2)-(Σx)^2)

    c = ((Σy)(Σx^2)-(Σx)(Σxy))/(n(Σx^2)-(Σx)^2)

    But when there are multiple features used to predict the target value, the equation looks like y = m1x1 + m2x2 + m3x3 + .... + c, which is an equation in n dimensions.

    A single closed-form equation that finds the linear relation between one training feature and the target won't work here. With multiple features we need to fit a hyperplane in n dimensions where the mean squared error is smallest, which is what we mean by finding the lowest value of the cost function.

    And about plotting the graph of the cost function: the library code you use doesn't actually plot the cost function. You just keep the parameter matrix and iterate, updating the slope and intercept until the cost converges (see the gradient descent sketch below).

    For a better understanding of the algorithm, click here.
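
A minimal sketch of the closed-form computation mentioned above, for the one-feature case, using the same formulas for m and c. The data values, variable names, and use of NumPy are illustrative assumptions, not part of the original answer.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # training feature (example data)
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])   # target feature (example data)
n = len(x)

# Direct least-squares formulas for slope m and intercept c, as given above
m = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
c = (np.sum(y) * np.sum(x**2) - np.sum(x) * np.sum(x * y)) / (n * np.sum(x**2) - np.sum(x)**2)

print(f"slope m = {m:.3f}, intercept c = {c:.3f}")  # fits y = m*x + c directly, no iteration
```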
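
And a minimal sketch of gradient descent on the same example data, showing the point about not plotting anything: the slope and intercept are simply updated from the gradient of the mean squared error until they converge. The learning rate and iteration count are assumed values for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])
n = len(x)

m, c = 0.0, 0.0   # start from an arbitrary point on the cost surface
lr = 0.01         # learning rate (assumed)

for _ in range(5000):
    y_pred = m * x + c
    error = y_pred - y                    # Ypred - Yact
    dm = (2.0 / n) * np.sum(error * x)    # gradient of MSE with respect to m
    dc = (2.0 / n) * np.sum(error)        # gradient of MSE with respect to c
    m -= lr * dm                          # step downhill on the cost surface
    c -= lr * dc

print(f"slope m = {m:.3f}, intercept c = {c:.3f}")  # converges toward the closed-form values
```

No graph is ever drawn here; the "plot" of the cost surface is only a teaching aid, while the algorithm itself just repeats these updates.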