Calculate the derivative of the sum of a mathematical function in MATLAB


In MATLAB, I want to compute the partial derivatives of a cost function J(theta_0, theta_1), which I need in order to run gradient descent.

The function J(theta_0, theta_1) is defined as:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Let's say h_theta(x) = theta_1 + theta_2*x (so theta_1 and theta_2 here play the roles of theta_0 and theta_1 above). Also, alpha is fixed and the starting values of theta_1 and theta_2 are given; in this example, alpha = 0.1, theta_1 = 0 and theta_2 = 1. I have all the values for x and y in two separate vectors.

vectorOfX = 
5
5
6

vectorOfY = 
6
6
10

Steps I took to try to solve this in MATLAB: I had no clue where to begin, so I started by trying to define a function and tried this:

theta_1 = 0
theta_2 = 1
syms x;
h_theta(x) = theta_1 + theta_2*x;

This worked, but is not what I really wanted. I wanted to get x^(i), which is in a vector. The next thing I tried was:

theta_1 = 0
theta_2 = 1
syms x;
h_theta(x) = theta_1 + theta_2*vectorOfX(x);

This gives the following error:

Error using sym/subsindex (line 672)
Invalid indexing or function definition. When defining a
function, ensure that the body of the function is a SYM
object. When indexing, the input must be numeric, logical or
':'.

Error in prog1>gradientDescent (line 46)
h_theta(x) = theta_1 + theta_2*vectorOfX(x);

I looked up this error but don't know how to solve it for this particular example. I have the feeling that I am making MATLAB work against me instead of using it in my favor.


Solution

  • When I have to perform symbolic computations, I prefer to use Mathematica. In that environment, this is the code to get the partial derivatives you are looking for:

    J[th1_, th2_, m_] := Sum[(th1 + th2*Subscript[x, i] - Subscript[y, i])^2, {i, 1, m}]/(2*m)
    D[J[th1, th2, m], th1]
    D[J[th1, th2, m], th2]
    

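    which gives, after simplification,

    $$\frac{\partial J}{\partial \theta_1} = \frac{1}{m} \sum_{i=1}^{m} \left( \theta_1 + \theta_2 x_i - y_i \right)$$

    $$\frac{\partial J}{\partial \theta_2} = \frac{1}{m} \sum_{i=1}^{m} x_i \left( \theta_1 + \theta_2 x_i - y_i \right)$$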

    Coming back to MATLAB, we can solve this problem with the following code:

    %// Constants.
    alpha = 0.1;
    theta_1 = 0;
    theta_2 = 1;
    X = [5 ; 5 ; 6];
    Y = [6 ; 6 ; 10];
    
    %// Number of points.
    m = length(X);
    
    %// Partial derivatives.
    Dtheta1 = @(theta_1, theta_2) sum(2*(theta_1+theta_2*X-Y))/2/m;
    Dtheta2 = @(theta_1, theta_2) sum(2*X.*(theta_1+theta_2*X-Y))/2/m;
    
    %// Loop initialization.
    toll = 1e-5;
    maxIter = 100;
    it = 0;
    err = 1;
    theta_1_Last = theta_1;
    theta_2_Last = theta_2;
    
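    %// Iterations. Both gradients are evaluated at the old values,
    %// so that theta_1 and theta_2 are updated simultaneously.
    while err>toll && it<maxIter
        grad_1 = Dtheta1(theta_1, theta_2);
        grad_2 = Dtheta2(theta_1, theta_2);
        theta_1 = theta_1 - alpha*grad_1;
        theta_2 = theta_2 - alpha*grad_2;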
    
        it = it + 1;
        err = norm([theta_1-theta_1_Last ; theta_2-theta_2_Last]);
        theta_1_Last = theta_1;
        theta_2_Last = theta_2;
    end
    

    Unfortunately, in this case the iterations do not converge, because the step size alpha = 0.1 is too large for this data.
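One way to see why alpha = 0.1 fails here is to look at the curvature of J. Because J is quadratic in the two parameters, its Hessian is constant, and plain gradient descent can only converge when alpha is below 2 divided by the largest Hessian eigenvalue. The sketch below is only an illustration under assumptions that are not part of the original code: the names H, lambdaMax, alphaMax, A and thetaLS, the smaller step alpha = 0.05, and the iteration count are all chosen for the example.

```matlab
% Data from the question.
X = [5; 5; 6];
Y = [6; 6; 10];
m = length(X);

% Hessian of J with respect to (theta_1, theta_2).
% It is constant because J is quadratic in the parameters.
H = [m, sum(X); sum(X), sum(X.^2)] / m;

% Gradient descent on a quadratic converges only if alpha < 2/lambdaMax.
lambdaMax = max(eig(H));
alphaMax = 2 / lambdaMax;     % roughly 0.067 here, i.e. below alpha = 0.1

% Rerun with a smaller step size (assumed value) and simultaneous updates.
alpha = 0.05;
theta = [0; 1];               % starting values [theta_1; theta_2]
A = [ones(m, 1), X];          % so that A*theta equals theta_1 + theta_2*X
for it = 1:100000
    grad = A' * (A*theta - Y) / m;   % both partial derivatives at once
    theta = theta - alpha * grad;
end

% Least-squares solution as a reference to compare against.
thetaLS = A \ Y;
```

With these values the loop ends close to thetaLS, but only after many iterations: the smallest Hessian eigenvalue is tiny compared with the largest one, so a limit such as maxIter = 100 would likely stop far from the minimum even with a suitable step size.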

    MATLAB is not very flexible for symbolic computations; however, one way to get those partial derivatives (using the Symbolic Math Toolbox) is the following:

    m = 10;
    syms th1 th2
    x = sym('x', [m 1]);
    y = sym('y', [m 1]);
    %// J is a symbolic expression (not a function handle), so diff can act on it directly.
    J = sum((th1+th2.*x-y).^2)/2/m;
    diff(J, th1)
    diff(J, th2)
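As a possible follow-up, the symbolic derivatives can also be built directly on the data from the question and then evaluated numerically. This is a minimal sketch that assumes the Symbolic Math Toolbox is available. The names dJ1, dJ2, g1 and g2 are illustrative only, and Dtheta1 and Dtheta2 are reused here for the generated handles.

```matlab
% Data from the question.
X = [5; 5; 6];
Y = [6; 6; 10];
m = length(X);

syms th1 th2
% Numeric data with symbolic parameters gives an expression in th1 and th2 only.
J = sum((th1 + th2.*X - Y).^2) / (2*m);

dJ1 = diff(J, th1);   % partial derivative with respect to th1
dJ2 = diff(J, th2);   % partial derivative with respect to th2

% Evaluate both derivatives at the starting point th1 = 0, th2 = 1.
g1 = double(subs(dJ1, [th1 th2], [0 1]));
g2 = double(subs(dJ2, [th1 th2], [0 1]));

% Convert the symbolic derivatives into ordinary function handles
% that can be called inside a numeric gradient descent loop.
Dtheta1 = matlabFunction(dJ1, 'Vars', [th1 th2]);
Dtheta2 = matlabFunction(dJ2, 'Vars', [th1 th2]);
```

At the starting point this gives g1 = -2 and g2 = -34/3, which matches the hand-written sums: the residuals theta_1 + theta_2*X - Y are -1, -1 and -4.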