Tags: machine-learning, octave, gradient-descent

Octave / Gradient Descent code: GD works fine, but it won't save the output from the cost function


The Gradient Descent part of this code works fine, but can anyone tell me why it's not initialising (or populating) the vector 'J_history'?

Here's the principal code:

data = load('ex1data1.txt'); %2 columns of data - a single x variable and a single y
y = data(:, 2);
m = length(y); %number of training examples; must be defined before X uses it
X = [ones(m, 1), data(:,1)]; %adds a column of 1s to allow for an intercept term
theta = zeros(2, 1); %initialising the vector of coefficient estimates at [0; 0]
iterations = 1500; %how many times to iterate the cost function
alpha = 0.01; %adjustment speed
theta = gradientDescent(X, y, theta, alpha, iterations); %call the GD function

The last line of the principal code calls on function gradientDescent:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1); %I don't understand why this doesn't initialise (or generate an error)!
  for iter = 1:num_iters
    theta = theta - ((alpha/m)*(X*theta-y)'*X)'; %adjusting the coefficient for each iteration
    J_history(iter) = computeCost(X, y, theta); %storing the output of the cost function at each iteration - again, I can't figure out why this doesn't work
  end
end

And the 'J_history' line in the code above calls on the function computeCost:

function J = computeCost(X, y, theta)
  m = length(y);
  J = 0;
  predictions = X*theta;
  sqrErrors = (predictions-y).^2;
  J = 1/(2*m)*sum(sqrErrors);
end

Thanks in advance for your help


Solution

  • You are calling your gradient function with only one output argument:

    theta = gradientDescent(X, y, theta, alpha, iterations); %call the GD function
    

    Your gradient function was defined to output two arguments:

    function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    

    In order to get the second argument out, you need to actually call this function with both output arguments:

    [ theta, J_history ] = gradientDescent( X, y, theta, alpha, iterations );
    

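    In a statement like [A,B,...] = funcname(...), the [A,B,...] on the left isn't an array; it is special syntax that tells Octave how many output arguments to collect. If you request only one output, as you have, the function still runs in full (which is why J_history is filled in inside gradientDescent and no error is raised), but every output after the first is discarded and never reaches your workspace.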

    See: https://docs.octave.org/latest/Assignment-Ops.html for details.
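
To try this without ex1data1.txt, here is a minimal self-contained sketch. The data is made up for illustration (y = 1 + 2x exactly), the update rule is an algebraically equivalent form of the one in the question, and the leading `1;` is only there so the functions can sit in a single script file. It assumes a reasonably recent Octave, where `~` can be used to skip an output.

```octave
1;  % not a function file: lets the functions below be defined inside a script

function J = computeCost(X, y, theta)
  m = length(y);
  J = 1/(2*m) * sum((X*theta - y).^2);   % half the mean squared error
end

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1);
  for iter = 1:num_iters
    % same update as in the question, written as X' * residual
    theta = theta - (alpha/m) * (X' * (X*theta - y));
    J_history(iter) = computeCost(X, y, theta);
  end
end

% Made-up data (assumption): stands in for ex1data1.txt
x = (1:5)';
y = 1 + 2*x;
m = length(y);
X = [ones(m, 1), x];

% Requesting both outputs is what puts J_history in the caller's workspace
[theta, J_history] = gradientDescent(X, y, zeros(2, 1), 0.01, 1500);

% If only the cost history is wanted, skip the first output with ~
[~, J_only] = gradientDescent(X, y, zeros(2, 1), 0.01, 1500);

disp(size(J_history));   % one cost value per iteration
```

With a step size this small the stored costs should not increase from one iteration to the next, so plotting J_history (for example with plot(J_history)) is a quick way to check that the descent is behaving.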