Sequentialfs using MATLAB Tables


I previously passed only large matrices as arguments to the sequentialfs function in MATLAB. I recently upgraded to a MATLAB release that includes the table data type, which is very handy. I attempted to rework a script that performs sequential feature selection so that it uses tables, but I have run into trouble.

normfmat = ngmft(:,4:end-1); % ngmft is previously loaded data table
y = gmft(:,2); % categorical variable with two classes

c = cvpartition(y,'k',10); % first error produced here

fun = @(trainData,trainClass,testData,testClass)...
  (sum(~strcmp(testClass,predict(ClassificationKNN.fit(trainData,trainClass,'NumNeighbors',1),testData))));

[fs,history] = sequentialfs(fun,X,y,'cv',c) % second error here

The first error produced is

Error using statslib.internal.grp2idx (line 44) You cannot subscript a table using only one subscript. Table subscripting requires both row and variable subscripts.

Error in grp2idx (line 28) [varargout{1:nargout}] = statslib.internal.grp2idx(s);

Error in cvpartition (line 164) cv.Group = grp2idx(N);

Error in script (line 32) c = cvpartition(group,'k',10);

This error goes away if I convert the classlab to a categorical array, but then a second error is produced at the sequentialfs call:

Error using sequentialfs (line 345) All input arguments must be tables.

So my question is, essentially: how does one use tables with sequential feature selection? The first error confuses me because I am passing cvpartition a table that was indexed with both row and variable subscripts. As for the second error, cvpartition returns a cvpartition object and y has been converted to a categorical array. The former was never a table, and I seem to be locked into the latter conversion because of the first error.


Solution

  • Using () indexing on a table returns a subset of the table, but the result is still a table, so it will cause errors if you pass it to functions that expect an ordinary array (numeric, categorical, and so on).

    If you simply want the values from the table, you'll want to use {} indexing instead.

    t = table([1 2 3].', [4 5 6].');
    
    %       Var1    Var2
    %       ____    ____
    %   
    %       1       4   
    %       2       5   
    %       3       6   
    
    class(t(1,:))
    
    %   table
    
    disp(t(1,:))
    
    %   Var1    Var2
    %   ____    ____
    %
    %   1       4   
    
    class(t{1,:})
    
    %   double
    
    disp(t{1,:})
    
    %   1     4
    

    See the MATLAB documentation on accessing data in a table for more information.

    Looking back at your specific example, you likely want to pass an array (not a table) to cvpartition to prevent the first error:

    c = cvpartition(gmft{:,2}, 'KFold', 10);
    

    For the call to sequentialfs, you haven't shown us what X is, but I would assume it's a table. If you fix the first error this way and leave y as a table instead of converting it to a categorical array, the sequentialfs call shouldn't complain, since both X and y would be tables.
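
An alternative that avoids mixing tables and arrays altogether is to extract both the features and the labels with {} indexing before calling cvpartition and sequentialfs. The following is only a minimal sketch under stated assumptions: the table T and its variable names (id, label, f1, f2, f3) are hypothetical stand-ins for your data, the labels are converted to numeric group indices with grp2idx so that they can be compared with ~=, and fitcknn is assumed to be available (R2014a or later; on older releases ClassificationKNN.fit from your question plays the same role).

```matlab
% Hypothetical stand-in for the loaded table: column 2 holds a two-class
% categorical label, columns 3:end hold numeric features.
rng(0);                                        % reproducible random data
n = 60;
label = categorical(repmat({'a'; 'b'}, n/2, 1));
f1 = 3 * double(label == 'b') + randn(n, 1);   % feature related to the class
f2 = randn(n, 1);                              % pure-noise feature
f3 = randn(n, 1);                              % pure-noise feature
T = table((1:n).', label, f1, f2, f3, ...
    'VariableNames', {'id', 'label', 'f1', 'f2', 'f3'});

% {} indexing extracts the underlying arrays instead of sub-tables.
X = T{:, 3:end};          % n-by-3 double matrix
y = grp2idx(T{:, 2});     % numeric group index (1 or 2) for each row

% Stratified 10-fold partition built from the label vector, not a table.
c = cvpartition(y, 'KFold', 10);

% Criterion: number of misclassified test observations for a 1-NN model.
fun = @(trainData, trainClass, testData, testClass) ...
    sum(testClass ~= predict( ...
        fitcknn(trainData, trainClass, 'NumNeighbors', 1), testData));

[fs, history] = sequentialfs(fun, X, y, 'cv', c);
```

Here fs is a logical row vector with one entry per column of X, so T.Properties.VariableNames(find(fs) + 2) should map the selected columns back to their names in this hypothetical table (the offset of 2 accounts for the id and label columns).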