I'm taking the course in MATLAB and have implemented gradient descent, but it gives incorrect results.
The code:
for iter = 1:num_iters
    sumTheta1 = 0;
    sumTheta2 = 0;
    for s = 1:m
        sumTheta1 = theta(1) + theta(2) .* X(s,2) - y(s);
        sumTheta2 = theta(1) + theta(2) .* X(s,2) - y(s) .* X(s,2);
    end
    theta(1) = theta(1) - alpha .* (1/m) .* sumTheta1;
    theta(2) = theta(2) - alpha .* (1/m) .* sumTheta2;
    J_history(iter) = computeCost(X, y, theta);
end
This is the important part. I think the implementation of the formula is correct, even though it's not optimized. The formula is:
theta1 = theta1 - alpha * (1/m) * sum_{i=1}^{m} (theta1 + theta2*x(i) - y(i))
theta2 = theta2 - alpha * (1/m) * sum_{i=1}^{m} ((theta1 + theta2*x(i) - y(i)) * x(i))
So where could the problem be?
EDIT: Code updated:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); % number of training examples
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        for s = 1:m
            sumTheta1 = ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s));
            sumTheta2 = ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s)) .* X(s,2);
        end
        temp1 = theta(1) - alpha .* (1/m) .* sumTheta1;
        temp2 = theta(2) - alpha .* (1/m) .* sumTheta2;
        theta(1) = temp1;
        theta(2) = temp2;
        J_history(iter) = computeCost(X, y, theta);
    end
end
EDIT(2): Fixed it; working code below.
Got it. It was +Dan's hint that did it. I will accept his answer, but I'm still posting the code here for anyone who gets stuck :) Cheers.
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); % number of training examples
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        sumTheta1 = 0;
        sumTheta2 = 0;
        for s = 1:m
            sumTheta1 = sumTheta1 + ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s));
            sumTheta2 = sumTheta2 + (((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s))) .* X(s,2);
        end
        temp1 = theta(1) - alpha .* (1/m) .* sumTheta1;
        temp2 = theta(2) - alpha .* (1/m) .* sumTheta2;
        theta(1) = temp1;
        theta(2) = temp2;
        % Save the cost J in every iteration
        J_history(iter) = computeCost(X, y, theta);
    end
end
At first glance, I notice that your sumTheta1 is not actually summing but rather overwriting itself on each pass through the inner loop. I think you meant:
sumTheta1 = sumTheta1 + theta(1) + theta(2) .* X(s,2) - y(s);
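The same applies to sumTheta2, where the whole error term (theta(1) + theta(2) .* X(s,2) - y(s)) also has to be parenthesized before it is multiplied by X(s,2).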
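On the sumTheta2 line, operator precedence matters as well as accumulation: .* binds tighter than binary minus, so without parentheses only y(s) gets multiplied by X(s,2). A minimal sketch with made-up single-sample values (h, yv and x are hypothetical names, not from the question):

```matlab
% Hypothetical values for one training sample, chosen only for illustration
h  = 5;   % prediction theta(1) + theta(2) * x
yv = 3;   % target value
x  = 2;   % feature value

wrong = h - yv .* x;     % .* is applied first: 5 - (3 * 2) = -1
right = (h - yv) .* x;   % error times feature: (5 - 3) * 2 = 4
```

The second form is the one the update rule for theta(2) calls for.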
But for future reference, you could replace the corrected loop:
for s = 1:m
    sumTheta1 = sumTheta1 + theta(1) + theta(2) .* X(s,2) - y(s);
    sumTheta2 = sumTheta2 + (theta(1) + theta(2) .* X(s,2) - y(s)) .* X(s,2);
end
with this vectorized formula:
sumTheta1 = sum(theta(1) + theta(2)*X(:,2) - y);
sumTheta2 = sum((theta(1) + theta(2)*X(:,2) - y) .* X(:,2));
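Taking the vectorization one step further, both sums can be folded into a single matrix product. The following is only a sketch under stated assumptions: gradientDescentVec is a hypothetical name, X is assumed to be m-by-n with a leading column of ones, y and theta are assumed to be column vectors, and the cost is assumed to be the usual squared-error cost, since computeCost is not shown in the question.

```matlab
function [theta, J_history] = gradientDescentVec(X, y, theta, alpha, num_iters)
% Hypothetical fully vectorized variant of gradientDescent.
% Assumes: X is m-by-n (first column all ones), y is m-by-1, theta is n-by-1.
    m = length(y);                     % number of training examples
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        err = X * theta - y;           % m-by-1 vector of prediction errors
        % X' * err yields every per-parameter sum at once, so all
        % parameters are updated simultaneously from the old theta.
        theta = theta - (alpha / m) * (X' * err);
        % Assumed squared-error cost, evaluated after the update
        J_history(iter) = sum((X * theta - y) .^ 2) / (2 * m);
    end
end
```

For example, calling it with X = [1 1; 1 2; 1 3], y = [1; 2; 3], theta = [0; 0], alpha = 0.1 and num_iters = 2000 should move theta toward [0; 1], because the data lie exactly on the line y = x.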