Tags: machine-learning, octave, linear-regression, gradient-descent

Gradient descent does not converge for linear regression


I've written a simple linear regression algorithm in Octave, but no matter what learning rate and number of iterations I choose, and even after working the matrices out on paper, the values of theta never converge. Can anyone see a mistake in my code?

data = load('ex1data2.txt');
X = data(:,1:2);
y = data(:,3);
m = rows(X);
X = [ones(m,1), data(:,1:2)];

alpha = 0.01;
iterations = 5000;
n = columns(X);
theta = zeros(n,1);

for count = 1:iterations

    hypo = zeros(1,m);
    hypo = theta'*X';
    sqr_err = (hypo-y').*(hypo-y');
    sum_sqr_err = sum(sqr_err);
    J = 1/(2*m)*sum_sqr_err;

    for i = 1:n
        theta(i) = theta(i)-(alpha/m)*((hypo-y')*X(:,i));
    end

end

J
theta

Thanks.


Solution

  • The main problem is that your two features are on very different scales, so gradient descent on the raw data diverges for any usable learning rate; normalize the features first. In MATLAB Online, this converged for me in 47 iterations:

    data = load('ex1data2.txt');
    X = data(:,1:2);
    y = data(:,3);
    
    mu = mean(X); % Feature means.
    s = max(X) - min(X); % Feature ranges.
    X = X - mu; % Center each feature.
    X = X ./ s; % Scale each feature by its range.
    
    m = size(X, 1); % Number of rows.
    n = size(X, 2); % Number of columns.
    X = [ones(m, 1) X]; % Add a column of ones for the intercept (bias) term.
    theta = zeros(n+1,1); % Initialize theta: n features plus the intercept.
    
    J = costFunction(X,y,theta);
    
    alpha = 2.02; % With normalized features, a large learning rate is stable.
    iterations = 100;
    
    j_hist = zeros(iterations,1); % Cost history, one entry per iteration.
    m = size(X, 1);
    n = size(theta,1); % n now counts the intercept column too.
    
    for i=1 : iterations
        hypo = X*theta;
        for j = 1 : n
            theta(j) = theta(j) - alpha*(1/m)*sum((hypo - y).*X(:,j));
        end
        j_hist(i) = costFunction(X,y,theta);
    end
    
    function J = costFunction(X,y,theta)
        prediction = X*theta;
        m = size(X, 1);
        sqError = (prediction - y).^2; % Squared errors are always non-negative.
        J = (1/(2*m))*sum(sqError); % The 1/2 factor simplifies the gradient.
    end
    
    %J
    %j_hist
    %theta
    

    Then, after running, you can inspect J, j_hist, and theta one at a time by uncommenting them.
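    The inner `for j` loop can also be collapsed into a single vectorized update, which is shorter and faster in Octave. Here is a minimal sketch of that variant; since ex1data2.txt may not be at hand, it uses a synthetic stand-in with two features on very different scales, normalized the same way as above:

    ```octave
    % Synthetic stand-in for ex1data2.txt: two features on very different scales.
    rand('state', 42);
    m = 50;
    X = [2000 + 2000*rand(m,1), ceil(5*rand(m,1))];
    y = 3 + 0.002*X(:,1) - X(:,2);          % Exact linear target, no noise.

    % Same normalization as above: center by the mean, scale by the range.
    mu = mean(X);
    s = max(X) - min(X);
    Xn = [ones(m,1), (X - mu) ./ s];        % Prepend the intercept column.

    theta = zeros(3,1);
    alpha = 1.0;
    for k = 1:500
        grad = (1/m) * (Xn' * (Xn*theta - y));  % All parameters updated at once.
        theta = theta - alpha * grad;
    end
    ```

    Because the synthetic targets are exactly linear in the features, `Xn*theta` should track `y` closely once the updates settle.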