Search code examples
matlabplotellipsenormal-distributionstandard-deviation

Ellipse around the data in MATLAB


I would like to reproduce the following figure in MATLAB:

exampleee.png

There are two classes of points with X and Y coordinates. I'd like to surround each class with an ellipse with one parameter of standard deviation, which determine how far the ellipse will go along the axis.

The figure was created with another software and I don't exactly understand how it calculates the ellipse.

Here is the data I'm using for this figure. The 1st column is class, 2nd - X, 3rd - Y. I can use gscatter to draw the points itself.

A = [
    0   0.89287 1.54987
    0   0.69933 1.81970
    0   0.84022 1.28598
    0   0.79523 1.16012
    0   0.61266 1.12835
    0   0.39950 0.37942
    0   0.54807 1.66173
    0   0.50882 1.43175
    0   0.68840 1.58589
    0   0.59572 1.29311
    1   1.00787 1.09905
    1   1.23724 0.98834
    1   1.02175 0.67245
    1   0.88458 0.36003
    1   0.66582 1.22097
    1   1.24408 0.59735
    1   1.03421 0.88595
    1   1.66279 0.84183
];

gscatter(A(:,2),A(:,3),A(:,1))

FYI, here is the SO question on how to draw ellipse. So, we just need to know all the parameters to draw it.


Update:

I agree that the center can be calculated as the means of X and Y coordinates. Probably I have to use principal component analysis (PRINCOMP) for each class to determine the angle and shape. Still thinking...


Solution

  • Consider the code:

    %# generate data
    num = 50;
    X = [ mvnrnd([0.5 1.5], [0.025 0.03 ; 0.03 0.16], num) ; ...
          mvnrnd([1 1], [0.09 -0.01 ; -0.01 0.08], num)   ];
    G = [1*ones(num,1) ; 2*ones(num,1)];
    
    gscatter(X(:,1), X(:,2), G)
    axis equal, hold on
    
    for k=1:2
        %# indices of points in this group
        idx = ( G == k );
    
        %# substract mean
        Mu = mean( X(idx,:) );
        X0 = bsxfun(@minus, X(idx,:), Mu);
    
        %# eigen decomposition [sorted by eigen values]
        [V D] = eig( X0'*X0 ./ (sum(idx)-1) );     %#' cov(X0)
        [D order] = sort(diag(D), 'descend');
        D = diag(D);
        V = V(:, order);
    
        t = linspace(0,2*pi,100);
        e = [cos(t) ; sin(t)];        %# unit circle
        VV = V*sqrt(D);               %# scale eigenvectors
        e = bsxfun(@plus, VV*e, Mu'); %#' project circle back to orig space
    
        %# plot cov and major/minor axes
        plot(e(1,:), e(2,:), 'Color','k');
        %#quiver(Mu(1),Mu(2), VV(1,1),VV(2,1), 'Color','k')
        %#quiver(Mu(1),Mu(2), VV(1,2),VV(2,2), 'Color','k')
    end
    

    screenshot


    EDIT

    If you want the ellipse to represent a specific level of standard deviation, the correct way of doing is by scaling the covariance matrix:

    STD = 2;                     %# 2 standard deviations
    conf = 2*normcdf(STD)-1;     %# covers around 95% of population
    scale = chi2inv(conf,2);     %# inverse chi-squared with dof=#dimensions
    
    Cov = cov(X0) * scale;
    [V D] = eig(Cov);
    

    OP_DATA