I'm working on the gradient descent function for an exercise, but I still can't get the expected outcome. Specifically, I receive two error messages:
Wrong output for the loss function. Check how you are implementing the matrix multiplications.
Wrong values for weight's matrix theta. Check how you are updating the matrix of weights.
When I run the function (see below), the cost decreases at each iteration, but it still does not converge to the expected result in the exercise. I have already tried several adaptations of the formula but couldn't solve it.
# gradientDescent
def gradientDescent(x, y, theta, alpha, num_iters):
    '''
    Input:
        x: matrix of features which is (m,n+1)
        y: corresponding labels of the input matrix x, dimensions (m,1)
        theta: weight vector of dimension (n+1,1)
        alpha: learning rate
        num_iters: number of iterations you want to train your model for
    Output:
        J: the final cost
        theta: your final weight vector
    Hint: you might want to print the cost to make sure that it is going down.
    '''
    ### START CODE HERE ###
    # get 'm', the number of rows in matrix x
    m = len(x)
    for i in range(0, num_iters):
        # get z, the dot product of x and theta
        # z = predictions
        z = np.dot(x, theta)
        h = sigmoid(z)
        loss = z - y
        # calculate the cost function
        J = (-1/m) * np.sum(loss)
        print("Iteration %d | Cost: %f" % (i, J))
        gradient = np.dot(x.T, loss)
        # update theta
        theta = theta - (1/m) * alpha * gradient
    ### END CODE HERE ###
    J = float(J)
    return J, theta
The issue is that I had wrongly applied the formula for the cost function and the formula for updating the weights:
J = -(1/m) × (yᵀ · log(h) + (1 − y)ᵀ · log(1 − h))
θ = θ − (α/m) × (xᵀ · (h − y))
The solution is to compute the cost from the log of the predictions h and, to match the update formula above, to build the gradient from (h − y) rather than (z − y):

loss = h - y
J = (-1/m) * (np.dot(y.T, np.log(h)) + np.dot((1-y).T, np.log(1-h)))
gradient = np.dot(x.T, loss)
theta = theta - (alpha/m) * gradient
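For reference, here is a minimal, self-contained sketch of the corrected function with these fixes applied. The sigmoid definition, the use of np.squeeze before converting the cost to a float, and the toy data in the usage snippet are my own assumptions for illustration; they are not part of the original exercise.

import numpy as np

def sigmoid(z):
    # element-wise logistic function (assumed here; the exercise provides its own)
    return 1 / (1 + np.exp(-z))

def gradientDescent(x, y, theta, alpha, num_iters):
    m = x.shape[0]  # number of training examples
    for i in range(num_iters):
        z = np.dot(x, theta)   # linear scores, shape (m,1)
        h = sigmoid(z)         # predictions in (0, 1), shape (m,1)
        # cross-entropy cost: J = -(1/m) * (y^T·log(h) + (1-y)^T·log(1-h))
        J = (-1/m) * (np.dot(y.T, np.log(h)) + np.dot((1-y).T, np.log(1-h)))
        # gradient of the cost w.r.t. theta uses (h - y), not (z - y)
        gradient = np.dot(x.T, (h - y))
        theta = theta - (alpha/m) * gradient
    # J is a (1,1) array; squeeze it before converting to a plain float
    return float(np.squeeze(J)), theta

# toy usage (made-up data, only to check that the cost goes down)
np.random.seed(1)
x = np.hstack([np.ones((20, 1)), np.random.randn(20, 2)])  # (m,n+1) with bias column
y = (np.random.rand(20, 1) > 0.5).astype(float)            # (m,1) labels in {0, 1}
theta = np.zeros((3, 1))                                    # (n+1,1)
J, theta = gradientDescent(x, y, theta, alpha=0.1, num_iters=100)
print("final cost:", J)

With these changes the returned cost matches the cross-entropy formula and the weight update follows θ = θ − (α/m) × (xᵀ · (h − y)) from the exercise.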