Incorrect Results from Gradient Descent in Matlab

I'm taking the course in Matlab, and I have written a gradient descent implementation, but it gives incorrect results.

The code:

for iter = 1:num_iters

    sumTheta1 = 0;
    sumTheta2 = 0;
    for s = 1:m
        sumTheta1 = theta(1) + theta(2) .* X(s,2) - y(s);
        sumTheta2 = theta(1) + theta(2) .* X(s,2) - y(s) .* X(s,2);
    end

    theta(1) = theta(1) - alpha .* (1/m) .* sumTheta1;
    theta(2) = theta(2) - alpha .* (1/m) .* sumTheta2;

    J_history(iter) = computeCost(X, y, theta);

end


This is the important part. I think the implementation of the formula is correct, even though it's not optimized. The formula is:

\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( \theta_1 + \theta_2 x^{(i)} - y^{(i)} \right)
\theta_2 := \theta_2 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( \theta_1 + \theta_2 x^{(i)} - y^{(i)} \right) x^{(i)}
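
For reference, a short sketch of where these two updates come from. It assumes computeCost implements the usual squared-error cost for a straight-line fit, which the post does not show, so treat the form of J below as an assumption:

```latex
J(\theta_1, \theta_2) = \frac{1}{2m} \sum_{i=1}^{m} \left( \theta_1 + \theta_2 x^{(i)} - y^{(i)} \right)^2

\frac{\partial J}{\partial \theta_1} = \frac{1}{m} \sum_{i=1}^{m} \left( \theta_1 + \theta_2 x^{(i)} - y^{(i)} \right)

\frac{\partial J}{\partial \theta_2} = \frac{1}{m} \sum_{i=1}^{m} \left( \theta_1 + \theta_2 x^{(i)} - y^{(i)} \right) x^{(i)}
```

Each update is then \theta_j := \theta_j - \alpha \, \partial J / \partial \theta_j. Note that in the second derivative the factor x^{(i)} multiplies the whole error term inside the sum, not just y^{(i)}.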

So where could the problem be?

EDIT: CODE updated

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)

m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    for s = 1:m
        sumTheta1 = ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s));
        sumTheta2 = ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s)) .* X(s,2);
    end

    temp1 = theta(1) - alpha .* (1/m) .* sumTheta1;
    temp2 = theta(2) - alpha .* (1/m) .* sumTheta2;

    theta(1) = temp1;
    theta(2) = temp2;

    J_history(iter) = computeCost(X, y, theta);

end

end



EDIT(2): Fixed it, working code.

Got it: it was Dan's hint that did it. I will accept his answer, and I'm still posting the code here for anyone who is stuck :), cheers.

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)

m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % Reset the accumulators at the start of every iteration
    sumTheta1 = 0;
    sumTheta2 = 0;

    % Accumulate the error terms over all training examples
    for s = 1:m
        sumTheta1 = sumTheta1 + ((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s));
        sumTheta2 = sumTheta2 + (((theta(1) .* X(s,1)) + (theta(2) .* X(s,2))) - (y(s))) .* X(s,2);
    end

    % Compute both updates before assigning, so theta changes simultaneously
    temp1 = theta(1) - alpha .* (1/m) .* sumTheta1;
    temp2 = theta(2) - alpha .* (1/m) .* sumTheta2;

    theta(1) = temp1;
    theta(2) = temp2;

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
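
The function above calls computeCost, which is not shown in this post. For anyone who wants to run the code, here is a minimal sketch of what it might look like, assuming the usual squared-error cost. The name computeCostSketch and the tiny data set are made up for illustration, and theta is assumed to be a column vector:

```matlab
% Sketch of a squared-error cost, written as an anonymous function so the
% snippet runs on its own. The name computeCostSketch is hypothetical.
computeCostSketch = @(X, y, theta) sum((X * theta - y) .^ 2) / (2 * length(y));

% Tiny illustrative data set lying exactly on y = 1 + 2x.
% The first column of ones corresponds to the intercept term.
X = [1 1; 1 2; 1 3];
y = [3; 5; 7];

J_perfect = computeCostSketch(X, y, [1; 2]);   % exact fit, so every residual is 0
J_zero    = computeCostSketch(X, y, [0; 0]);   % residuals are -y: (9 + 25 + 49) / 6
```

If J_history does not decrease from one iteration to the next, the usual suspects are a learning rate alpha that is too large or a bug in the update.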




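  • At first glance I notice that your sumTheta1 is not actually summing: it is overwritten on every pass through the inner loop, so only the last training example contributes. I think you meant:

    sumTheta1 = sumTheta1 + theta(1) + theta(2) .* X(s,2) - y(s);

    And the same for sumTheta2. There, also wrap the whole error term in parentheses before multiplying by X(s,2); as written, only y(s) gets multiplied, because .* binds tighter than the subtraction.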

    But for future reference you could replace this (corrected) loop:

    for s = 1:m
        sumTheta1 = sumTheta1 + theta(1) + theta(2) .* X(s,2) - y(s);
        sumTheta2 = sumTheta2 + (theta(1) + theta(2) .* X(s,2) - y(s)) .* X(s,2);
    end

    with this vectorized formula:

    sumTheta1 = sum(theta(1) + theta(2)*X(:,2) - y);
    sumTheta2 = sum((theta(1) + theta(2)*X(:,2) - y) .* X(:,2));
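
Taking the vectorization one step further, both sums can be folded into a single matrix product. The following is only a sketch under stated assumptions: the data is synthetic, alpha and num_iters are values picked to converge for this particular data, and X is assumed to carry a leading column of ones as in the question. The result is compared against the direct least-squares solution from the backslash operator as a sanity check.

```matlab
% Fully vectorized batch gradient descent (sketch). The data is synthetic.
x = (1:10)';                 % hypothetical feature values
y = 2 + 3 * x;               % noiseless targets from y = 2 + 3x
m = length(y);
X = [ones(m, 1), x];         % leading column of ones for the intercept

theta = zeros(2, 1);
alpha = 0.02;                % assumed step size, small enough for this data
num_iters = 20000;

for iter = 1:num_iters
    err = X * theta - y;             % m-by-1 vector of residuals
    grad = (X' * err) / m;           % both partial derivatives at once
    theta = theta - alpha * grad;    % simultaneous update of theta(1), theta(2)
end

theta_exact = X \ y;                 % least-squares reference solution
```

Because theta is updated as one vector, the temp1/temp2 juggling is no longer needed, and the same loop works unchanged if X gains more feature columns.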