I've written a vectorized implementation of regularized gradient descent for logistic regression in Python using NumPy. I've used a numerical check to verify that my implementation is correct. The check passes for my linear regression gradient descent, but the logistic version fails, and I cannot figure out why. Any help would be appreciated. So here goes:
These are my methods for calculating the cost and the gradient (the update function calculates the gradient and updates the parameters):
def _hypothesis(parameters, features):
    return Activation.sigmoid(features.dot(parameters))
def _cost_function(parameters, features, targets):
    m = features.shape[0]
    return np.sum(-targets * (np.log(LogisticRegression._hypothesis(parameters, features)) - (1 - targets) * (
        np.log(1 - LogisticRegression._hypothesis(parameters, features))))) / m
def _update_function(parameters, features, targets, extra_param):
    regularization_vector = extra_param.get("regularization_vector", 0)
    alpha = extra_param.get("alpha", 0.001)
    m = features.shape[0]
    return parameters - alpha / m * (
        features.T.dot(LogisticRegression._hypothesis(parameters, features) - targets)) + \
        (regularization_vector / m) * parameters
The cost function doesn't include regularization, but I run the test with a regularization vector equal to zero, so it does not matter. This is how I am testing:
def numerical_check(features, parameters, targets, cost_function, update_function, extra_param, delta):
    gradients = - update_function(parameters, features, targets, extra_param)
    parameters_minus = np.copy(parameters)
    parameters_plus = np.copy(parameters)
    parameters_minus[0, 0] = parameters_minus[0, 0] + delta
    parameters_plus[0, 0] = parameters_plus[0, 0] - delta
    approximate_gradient = - (cost_function(parameters_plus, features, targets) -
                              cost_function(parameters_minus, features, targets)) / (2 * delta) / parameters.shape[0]
    return abs(gradients[0, 0] - approximate_gradient) <= delta
Basically, I compute the gradient numerically by shifting the first parameter by delta to the left and to the right, and then I compare it with the gradient I get from the update function. I am using initial parameters equal to 0, so the updated parameter returned is equal to the gradient divided by the number of features. Also, alpha is equal to one. Unfortunately, I am getting different values from the two methods and I cannot figure out why. Any advice on how to troubleshoot this problem would be really appreciated.
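For reference, the general idea of the check described above can be sketched independently of the logistic regression code. This is only a minimal illustration on a toy function with a known gradient; the helper name `numerical_gradient` and the quadratic test function are assumptions made for the example and are not part of my class.

```python
import numpy as np

def numerical_gradient(cost, parameters, delta=1e-6):
    """Central-difference estimate of the gradient, one entry at a time."""
    grad = np.zeros_like(parameters, dtype=float)
    for idx in np.ndindex(parameters.shape):
        plus = parameters.astype(float)   # astype returns a fresh copy
        minus = parameters.astype(float)
        plus[idx] += delta                # shift this entry to the right
        minus[idx] -= delta               # shift this entry to the left
        grad[idx] = (cost(plus) - cost(minus)) / (2 * delta)
    return grad

# Toy check on f(p) = sum(p**2), whose exact gradient is 2 * p.
p = np.array([[1.0], [-2.0], [0.5]])
approx = numerical_gradient(lambda q: np.sum(q ** 2), p)
matches = np.allclose(approx, 2 * p, atol=1e-6)
```

Checking every entry rather than only `[0, 0]` makes it easier to see whether a mismatch affects one parameter or all of them.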
There is an error in your cost function: the brackets are misplaced, so `(1 - targets) * log(1 - h)` ends up inside the factor that is multiplied by `-targets`. I've fixed that:
def _cost_function(parameters, features, targets):
    m = features.shape[0]
    return -np.sum(
        targets * np.log(LogisticRegression._hypothesis(parameters, features))
        + (1 - targets) * np.log(1 - LogisticRegression._hypothesis(parameters, features))
    ) / m
Try to write your code cleanly; it makes errors like this one easier to spot.
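To see the effect of the bracket placement, note that with 0/1 targets the original version multiplies the second log term by `targets * (1 - targets)`, which is always zero, so that term drops out of the cost. The sketch below compares both versions against the analytic gradient on a tiny hand-made dataset. It is a standalone approximation of the setup, not the question's class: `sigmoid`, `cost_fixed`, `cost_original`, `analytic_gradient` and `central_difference` are hypothetical stand-ins, and the data is made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_fixed(theta, X, y):
    # Cross-entropy cost with the corrected brackets.
    h = sigmoid(X.dot(theta))
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / X.shape[0]

def cost_original(theta, X, y):
    # Same bracket placement as the cost function in the question.
    h = sigmoid(X.dot(theta))
    return np.sum(-y * (np.log(h) - (1 - y) * np.log(1 - h))) / X.shape[0]

def analytic_gradient(theta, X, y):
    # Gradient of the unregularized cross-entropy cost.
    return X.T.dot(sigmoid(X.dot(theta)) - y) / X.shape[0]

def central_difference(cost, theta, X, y, delta=1e-6):
    grad = np.zeros_like(theta)
    for i in range(theta.shape[0]):
        step = np.zeros_like(theta)
        step[i, 0] = delta
        grad[i, 0] = (cost(theta + step, X, y) - cost(theta - step, X, y)) / (2 * delta)
    return grad

# Made-up data: a bias column plus one feature, with binary targets.
X = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0], [1.0, 0.3]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])
theta = np.zeros((2, 1))

grad = analytic_gradient(theta, X, y)
fixed_ok = np.allclose(central_difference(cost_fixed, theta, X, y), grad, atol=1e-6)
original_ok = np.allclose(central_difference(cost_original, theta, X, y), grad, atol=1e-6)
```

On this data `fixed_ok` comes out `True` and `original_ok` comes out `False`, which matches the mismatch described in the question.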