Tags: matlab, optimization, matrix-multiplication, floating-accuracy, numerical

Accuracy issues with multiplication of matrices in Matlab R2012b


I have implemented a script that performs constrained optimization to solve for the optimal parameters of a Support Vector Machine (SVM) model. I noticed that my script for some reason gives inaccurate results (although very close to the real values). For example, the typical situation is that the result of a calculation should be exactly 0, but instead it is something like

-1/18014398509481984 = -5.551115123125783e-17

This situation happens when I multiply matrices with vectors. What makes this even stranger is that if I do the multiplication by hand in the command window in Matlab, I get exactly 0 as the result.

Let me give an example: if I take the vectors Aq = [-1 -1 1 1] and x = [12/65 28/65 32/65 8/65]', I get exactly 0 from their multiplication if I do this in the command window, as you can see in the picture below:

[screenshot: Aq*x evaluated in the command window returns exactly 0]

If, on the other hand, I do this in my function script, the result is not 0 but rather the value -1/18014398509481984.
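
For reference, the magnitude involved is typical double-precision rounding residue. As a minimal illustration (independent of my data and script), this classic command-window example produces exactly the same value:

format long
0.1 + 0.2 - 0.3   % gives 5.551115123125783e-17, not 0, because each
                  % constant is rounded to the nearest double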

Here is the part of my script responsible for this multiplication (I've added Aq and x to the output so you can see their contents as well):

disp('DOT PRODUCT OF ACTIVE SET AND NEW POINT: ')
Aq
x
Aq*x

Here is the result of the code above when run:

[screenshot: script output showing Aq, x, and Aq*x = -1/18014398509481984]

As you can see, the value isn't exactly 0 even though it really should be. Note that this problem doesn't occur for all possible values of Aq and x. If Aq = [-1 -1 1 1] and x = [4/13 4/13 4/13 4/13], the result is exactly 0, as you can see below:

[screenshot: Aq*x evaluated with x = [4/13 4/13 4/13 4/13] returns exactly 0]

What is causing this inaccuracy? How can I fix this?

P.S. I didn't include my whole code because it's not very well documented and is a few hundred lines long, but I will if requested.

Thank you!

UPDATE: a new test, following Ander Biguri's advice:

[screenshot: output of the new test]

UPDATE 2: THE CODE

function [weights, alphas, iters] = solveSVM(data, labels, C, e)
% FUNCTION [weights, alphas, iters] = solveSVM(data, labels, C, e)
% 
% AUTHOR: jjepsuomi
%
% VERSION: 1.0
%
% DESCRIPTION: 
% - This function will attempt to solve the optimal weights for a Support
% Vector Machines (SVM) model using active set method with gradient
% projection.
% 
% INPUTS: 
% "data" a n-by-m data matrix. The number of rows 'n' corresponds to the
% number of data points and the number of columns 'm' corresponds to the
% number of variables. 
% "labels" a 1-by-n row vector of data labels from the set {-1,1}. 
% "C" Box costraint upper limit. This will constrain the values of 'alphas'
% to the range 0 <= alphas <= C. If hard-margin SVM model is required set
% C=Inf. 
% "e" a real value corresponding to the convergence criterion, that is if
% solution Xi and Xi-1 are within distance 'e' from each other stop the
% learning process, i.e. IF |F(Xi)-F(Xi-1)| < e ==> stop learning process.
%
% OUTPUTS: 
% "weights" a vector corresponding to the optimal decision line parameters.
% "alphas" a vector of alpha-values corresponding to the optimal solution
% of the dual optimization problem of SVM.
% "iters" number of iterations until learning stopped.
%
% EXAMPLE USAGE 1: 
% 
% 'Hard-margin SVM': 
%
% data = [0 0;2 2;2 0;3 0];
% labels = [-1 -1 1 1];
% [weights, alphas, iters] = solveSVM(data, labels, Inf, 10^-100)
% 
% EXAMPLE USAGE 2:
%
% 'Soft-margin SVM':
%
% data = [0 0;2 2;2 0;3 0];
% labels = [-1 -1 1 1];
% [weights, alphas, iters] = solveSVM(data, labels, 0.8, 10^-100)


% STEP 1: INITIALIZATION OF THE PROBLEM
format long
% Calculate linear kernel matrix
L = kron(labels', labels);
K = data*data';
% Hessian matrix
Qd = L.*K;
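% (Element-wise, Qd(i,j) = labels(i)*labels(j)*(data(i,:)*data(j,:)'), the
% standard Hessian of the SVM dual problem.)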
% The minimization function (the name L is reused here; the label
% outer-product matrix above is no longer needed at this point)
L = @(a) (1/2)*a'*Qd*a - ones(1, length(a))*a;
% Gradient of the minimizable function 
gL = @(a) a'*Qd - ones(1, length(a));


% STEP 2: THE LEARNING PROCESS, ACTIVE SET WITH GRADIENT PROJECTION

% Initial feasible solution (required by gradient projection)
x = zeros(length(labels), 1);
xo = x; % previous solution, used by the stopping criterion below
iters = 1;
optfound = 0;

while optfound == 0 % criterion met    

    % Negative of the gradient at initial solution
    g = -gL(x);
    % Set the active set and projection matrix
    Aq = labels; % In plane y^Tx = 0
    P = eye(length(x))-Aq'*inv(Aq*Aq')*Aq; % In plane projection
    % Values smaller than 'eps' are changed into 0
    P(abs(P) < eps) = 0;
    d = P*g'; % Projection onto plane

    if any(x==0 | x==C) % Constraints active?
        acinds = find(x==0 | x==C);
        for i = 1:length(acinds)
            if (x(acinds(i)) == 0 && d(acinds(i)) < 0) || (x(acinds(i)) == C && d(acinds(i)) > 0)
                % Make the constraint vector
                constr = zeros(1,length(x));
                constr(acinds(i)) = 1;
                Aq = [Aq; constr];
            end
        end
        % Update the projection matrix
        P = eye(length(x))-Aq'*inv(Aq*Aq')*Aq; % In plane / box projection
        % Values smaller than 'eps' are changed into 0
        P(abs(P) < eps) = 0;
        d = P*g'; % Projection onto plane / border
    end

    %%%% DISPLAY INFORMATION, THIS PART IS NOT NECESSARY, ONLY FOR DEBUGGING

    if any(Aq*x ~= 0)
        disp('ACTIVE SET CONSTRAINTS Aq :')
        Aq
        disp('CURRENT SOLUTION x :')
        x
        disp('MULTIPLICATION OF Aq and x')
        Aq*x
    end

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    % Values smaller than 'eps' are changed into 0
    d(abs(d) < eps) = 0;
    if any(d~=0) && rank(P) < length(x) % Line search for optimal lambda
        lopt = ((g*d)/(d'*Qd*d));       
        lmax = inf;   
        for i = 1:length(x)
            if d(i) < 0 && -x(i) ~= 0 && -x(i)/d(i) <= lmax
                lmax = -x(i)/d(i);
            elseif d(i) > 0 && (C-x(i))/d(i) <= lmax
                lmax = (C-x(i))/d(i);
            end
        end 
        lambda = max(0, min([lopt, lmax]));
        if abs(lambda) < eps
            lambda = 0;
        end
        xo = x;
        x = x + lambda*d;
        iters = iters + 1;
    end
    % Check whether search direction is 0-vector or 'e'-criterion met.
    if all(d==0) || abs(L(x)-L(xo)) < e
        optfound = 1;
    end
end

%%% STEP 3: GET THE WEIGHTS
alphas = x;
w = zeros(1, length(data(1,:)));
for i = 1:size(data,1)
    w = w + labels(i)*alphas(i)*data(i,:);
end
svinds = find(alphas>0); 
svind = svinds(1);
b = 1/labels(svind) - w*data(svind, :)';



%%% STEP 4: OPTIMALITY CHECK, KKT conditions. See KKT-conditions for reference.
weights = [b; w'];
datadim = length(data(1,:));
Q = [zeros(1,datadim+1); zeros(datadim, 1), eye(datadim)];
A = [ones(size(data,1), 1), data];
for i = 1:length(labels)
    A(i,:) = A(i,:)*labels(i);
end
LagDuG = Q*weights - A'*alphas;
Ac = A*weights - ones(length(labels),1);
alpA = alphas.*Ac;
LagDuG(abs(LagDuG) < 10^-14) = 0;
if ~any(alphas < 0) && all(LagDuG == zeros(datadim+1,1)) && all(Ac >= -10^-6) && all(abs(alpA) < 10^-6)
    disp('Optimal found, Karush-Kuhn-Tucker conditions satisfied.')
else
    disp('Optimal not found, Karush-Kuhn-Tucker conditions not satisfied.')
end

% VISUALIZATION FOR 2D-CASE
if size(data, 2) == 2 
    pinds = find(labels > 0);
    ninds = find(labels < 0);
    plot(data(pinds, 1), data(pinds, 2), 'o', 'MarkerFaceColor', 'red', 'MarkerEdgeColor', 'black')
    hold on
    plot(data(ninds, 1), data(ninds, 2), 'o', 'MarkerFaceColor', 'blue', 'MarkerEdgeColor', 'black')
    Xb = min(data(:,1))-1;
    Xe = max(data(:,1))+1;
    Yb = -(b+w(1)*Xb)/w(2);
    Ye = -(b+w(1)*Xe)/w(2);
    lineh = plot([Xb Xe], [Yb Ye], 'LineWidth', 2);
    supvh = plot(data(find(alphas~=0), 1), data(find(alphas~=0), 2), 'g.');
    legend([lineh, supvh], 'Decision boundary', 'Support vectors');
    hold off
end

NOTE:

If you run the EXAMPLE 1, you should get an output starting with the following:

[screenshot: beginning of the output of EXAMPLE 1, with Aq*x slightly different from 0]

As you can see, the multiplication of Aq and x doesn't produce the value 0, even though it should. That is not a big deal in this particular example, but if I have more data points with lots of decimals in them, this inaccuracy becomes a bigger and bigger problem, because the calculations are not exact. This is bad, for example, when I'm searching for a new direction vector while moving towards the optimal solution in the gradient projection method: the search direction isn't exactly the correct direction, only close to it. This is why I want the exactly correct values... is this possible?

I wonder if the decimals in the data points have something to do with the accuracy of my results. See the picture below:

[screenshot: the data points, whose values have many decimals]

So the question is: is this caused by the data, or is there something wrong in the optimization procedure?


Solution

  • Do you use the format function inside your script? It looks like you have used format rat somewhere.
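
    A quick way to verify (a sketch; the displayed form may vary between MATLAB versions) is to print the value from the question under both formats:

    format rat
    -5.551115123125783e-17   % shown as -1/18014398509481984
    format long
    -5.551115123125783e-17   % shown as -5.551115123125783e-17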

    You can always use the MATLAB eps function, which returns the precision used inside MATLAB. The absolute value of -1/18014398509481984 (exactly 2^-54) is smaller than eps (2^-52), according to my Matlab R2014B:

    format long
    a = abs(-1/18014398509481984)   % 5.551115123125783e-17
    b = eps                         % 2.220446049250313e-16
    a < b                           % logical 1
    
    

    This basically means that the result is effectively zero (MATLAB stopped the calculation because, according to the eps value, the result was already fine).

    Otherwise you can just use format long inside your script before the calculation.
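
    If you need an exact zero downstream, one option (a sketch that reuses the eps-thresholding your script already applies to P and d) is to clamp sub-eps residue after the product:

    y = Aq*x;
    y(abs(y) < eps) = 0;   % treat sub-eps residue as an exact zero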

    Edit

    I see the inv function inside your code; try replacing it with the \ operator (mldivide). The results will be more accurate, as it uses Gaussian elimination without forming the inverse.

    The inv documentation states:

    In practice, it is seldom necessary to form the explicit inverse of a matrix. A frequent misuse of inv arises when solving the system of linear equations Ax = b. One way to solve this is with x = inv(A)*b. A better way, from both an execution time and numerical accuracy standpoint, is to use the matrix division operator x = A\b. This produces the solution using Gaussian elimination, without forming the inverse.
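
    Applied to the projection-matrix lines from the question, that advice would look like this (a sketch, untested against the full script):

    % before: explicit inverse
    P = eye(length(x)) - Aq'*inv(Aq*Aq')*Aq;
    % after: mldivide solves (Aq*Aq')*Z = Aq without forming the inverse
    P = eye(length(x)) - Aq'*((Aq*Aq')\Aq);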