
Logistic Regression using Gradient Descent


We are given data on student exam results, and our goal is to predict whether a student will pass or fail based on the number of hours slept and the number of hours spent studying. We have two features (hours studied, hours slept) and two classes: passed (1) and failed (0).

Studied Slept Passed
4.85    9.63  1
8.62    3.23  0
5.43    8.23  1
9.21    6.34  0

Can anyone please explain how to calculate the cost for the first two iterations?


Solution

  • Let's say one iteration of your training accepts one sample from your dataset. Since you are using logistic regression, your initial prediction is going to be

    p = sigmoid(x1*w1 + x2*w2 + b)
    

    where x1 and x2 are the studied and slept input values, w1 and w2 are the weights of your model, and b is the bias term (a scalar here, since there is a single output). This will return a value between 0 and 1.

    Also note that I have named the value on the left p, since this is the model's predicted value.
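
    As a concrete sketch of this prediction step (the question doesn't give initial parameter values, so zero initialization is assumed here):

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# First training sample: hours studied, hours slept
x1, x2 = 4.85, 9.63

# Assumed initial parameters (not given in the question): all zeros
w1, w2, b = 0.0, 0.0, 0.0

p = sigmoid(x1 * w1 + x2 * w2 + b)
print(p)  # with zero parameters, sigmoid(0) = 0.5
```

    With any other initialization the value of p changes, but it always stays strictly between 0 and 1.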

    The cost for this iteration, if you're using binary cross entropy, is going to be:

    −(y*log(p) + (1−y)*log(1−p))
    

    where y is the true value of the input sample. If we take the first input sample in your training data, the true value is 1.

    Therefore, your cost, J, is:

    J = -(1*log(p) + (1-1)*log(1-p))
      = -log(p)
    

    Since we now know the value of the cost, which is a function of the parameters of the model, we can use the chain rule to find the gradient of the cost (or error) with respect to the parameters.
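
    For binary cross-entropy composed with a sigmoid, the chain rule collapses to a simple form: dJ/dz = p - y, so dJ/dwn = (p - y) * xn and dJ/db = p - y. A sketch that checks the analytic gradient against a finite-difference estimate (the parameter values are arbitrary illustrations, not from the question):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w1, w2, b, x1, x2, y):
    # Binary cross-entropy for a single sample
    p = sigmoid(x1 * w1 + x2 * w2 + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# First sample from the question, with arbitrary illustrative parameters
x1, x2, y = 4.85, 9.63, 1
w1, w2, b = 0.1, -0.2, 0.05

# Analytic gradients via the chain rule: dJ/dz = p - y
p = sigmoid(x1 * w1 + x2 * w2 + b)
dJ_dw1 = (p - y) * x1
dJ_dw2 = (p - y) * x2
dJ_db = p - y

# Finite-difference check on dJ/dw1
eps = 1e-6
numeric = (cost(w1 + eps, w2, b, x1, x2, y)
           - cost(w1 - eps, w2, b, x1, x2, y)) / (2 * eps)
print(dJ_dw1, numeric)  # the two should agree to several decimal places
```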

    You would then update the parameters of the model with the following equations:

    wn = wn - alpha * dJ/dwn
    b  = b  - alpha * dJ/db
    

    where n = 1, 2 and alpha is the learning rate of your model.

    For the next iteration, you would just do the same process with another sample from your dataset.
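
    Putting the steps above together, the first two iterations can be traced end to end. The initial weights and the learning rate are assumptions, since the question doesn't fix them; with zero-initialized parameters, sigmoid(0) = 0.5, so the first cost is -log(0.5) ≈ 0.693 regardless of the input values, while the second cost depends on the update from the first iteration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# (studied, slept, passed) — the first two samples from the question
data = [(4.85, 9.63, 1), (8.62, 3.23, 0)]

# Assumed hyperparameters: zero initialization, learning rate 0.1
w1, w2, b = 0.0, 0.0, 0.0
alpha = 0.1

costs = []
for x1, x2, y in data:
    # Forward pass: prediction and binary cross-entropy cost
    p = sigmoid(x1 * w1 + x2 * w2 + b)
    J = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    costs.append(J)
    # Parameter update using the chain-rule gradients, dJ/dz = p - y
    w1 -= alpha * (p - y) * x1
    w2 -= alpha * (p - y) * x2
    b -= alpha * (p - y)

print(costs)  # first cost is -log(0.5) ≈ 0.6931
```

    Different initializations or learning rates give different numbers, which is why the question can't be answered with a single pair of values without fixing those choices first.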