Tags: matlab, classification, pca, feature-selection, supervised-learning

How to use the pca function in MATLAB to select effective features?


I'm new to PCA, and after some research I found that the PCA algorithm can be used to select the most effective features.

I want to use the pca function in MATLAB to select the best features for classifying data into two classes labeled "health" and "unhealthy" (supervised classification).

My question is: should I set some parameters of this function to do this, or do I have to write the code myself because the pca function does not support it?

As an example, I have a data set with 200 rows and 5 features:

1-Age
2-Weight
3-Height
4-Skin Color
5-Eye Color

and I want to use the pca function to find the effective features, for example:

1-Age
3-Height
5-Eye Color

and then classify the data (2 classes with labels "health" and "unhealthy").


Solution

  • % Remove labels (assumes the class labels are in the last column)
    features=AllMyData(:,1:end-1);
    
    % get dimensions
    [m,n] = size(features);
    
    %# Remove the mean of each feature (column); observations are in rows
    features = features - repmat(mean(features,1), m, 1);
    
    %# Compute the economy-size SVD (the columns of V are the principal directions)
    [U,S,V] = svd(features,'econ');
    
    %# Compute the number of eigenvectors representing
    %#  95% of the variation (variance is proportional to
    %#  the squared singular values)
    coverage = cumsum(diag(S).^2);
    coverage = coverage ./ max(coverage);
    [~, nEig] = max(coverage > 0.95);
    
    %# Compute the squared norm of each feature's loadings on the
    %#  retained components (row i of V corresponds to feature i)
    norms = zeros(n,1);
    for i = 1:n
      norms(i) = norm(V(i,1:nEig))^2;
    end
    
    %# Rank the features, most influential first
    [~, idx] = sort(norms, 'descend');
    idx(1:n)'
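
The same ranking can probably be obtained with the built-in pca function instead of a manual SVD. The following is only a minimal sketch, assuming the Statistics and Machine Learning Toolbox is installed and that the features are stored as a numeric matrix with observations in rows. The random matrix is hypothetical stand-in data, not the asker's data set, and the 95% threshold is carried over from the code above.

```matlab
% Hypothetical stand-in data: 200 observations, 5 numeric features
rng(0);
features = randn(200, 5);

% pca centers each column by default. 'explained' holds the percentage
% of the total variance captured by each principal component.
[coeff, ~, ~, ~, explained] = pca(features);

% Smallest number of components that covers at least 95% of the variance
nEig = find(cumsum(explained) >= 95, 1);

% Squared norm of each feature's loadings on the retained components
% (row i of coeff corresponds to feature i)
norms = sum(coeff(:, 1:nEig).^2, 2);

% Feature indices, most influential first
[~, idx] = sort(norms, 'descend');
disp(idx')
```

Note that pca never sees the class labels, so this ranks features by how much they contribute to the overall variance, not by how well they separate "health" from "unhealthy".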