
Count unique rows in a cell full of vectors


I have a cell array in MATLAB where each element contains a vector, and the vectors may differ in length.

e.g.

C = {[1 2 3], [2 4 5 6], [1 2 3], [6 4], [7 6 4 3], [4 6], [6 4]}

As you can see, some of the vectors are repeated and others are unique.

I want to count the number of times each vector occurs and return the counts so that I can populate a table in a GUI, where each row is a unique combination and the data shows how many times that combination occurs.

e.g.

            Count
"[1 2 3]"     2
"[6 4]"       2
"[2 4 5 6]"   1
"[7 6 4 3]"   1
"[4 6]"       1

I should say that the order of the numbers in each vector is important, i.e. [6 4] is not the same as [4 6].

Any thoughts on how I can do this fairly efficiently?

Thanks to the people who have commented so far. As @Divakar kindly pointed out, I forgot to mention that the values in the vectors can be more than one digit long, e.g. [46, 36 28]. My original code concatenated the vector [1 2 3 4] into 1234 and then used hist to do the counting. Of course, this falls apart once you go above single digits, as you can't tell the difference between [1, 2, 3, 4] and [12, 34].


Solution

  • You can convert all the entries to char, then to a 2D numeric array, and finally use unique(..., 'rows') to get a label for each unique row. The counts then follow from those labels.

    C = {[46, 36 28], [2 4 5 6], [46, 36 28], [6 4], [7 6 4 3], [4 6], [6 4]} %// Input
    
    char_array1 = char(C{:})-0; %// stack the vectors as rows of a padded char array, then convert back to numeric
    [~,unqlabels,entry_labels] = unique(char_array1,'rows'); %// get unique rows
    count = histc(entry_labels,1:max(entry_labels)); %// counts of each unique row
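
    The conversion step can be inspected on its own. This is a minimal sketch on a smaller, made-up input (the variable name padded is illustrative, not from the answer): char pads the shorter rows with spaces, so after subtracting 0 the padding shows up as the value 32.

    ```matlab
    C = {[46, 36 28], [6 4], [4 6]};  % small illustrative input
    padded = char(C{:}) - 0;          % one row per vector; short rows are right-padded with 32 (space)
    disp(padded)
    ```

    One consequence of this padding is that a vector ending in 32, such as [6 4 32], produces the same row as [6 4], so the two would be counted together.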
    

    To present the output in the format asked for in the question, you can use this -

    out = [C(unqlabels)' num2cell(count)];
    

    Output -

    out = 
        [1x4 double]    [1]
        [1x2 double]    [1]
        [1x2 double]    [2]
        [1x4 double]    [1]
        [1x3 double]    [2]
    

    You can display the unique rows with celldisp(C(unqlabels)) -

    ans{1} =
         2     4     5     6
    ans{2} =
         4     6
    ans{3} =
         6     4
    ans{4} =
         7     6     4     3
    ans{5} =
        46    36    28
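
    For the GUI table in the question, the unique vectors can also be turned into text labels. The following is a sketch, assuming the labels are built with mat2str and the counts with accumarray (used here in place of histc, which is a substitution and not part of the answer above). The names labels, order and tableData are illustrative.

    ```matlab
    C = {[46, 36 28], [2 4 5 6], [46, 36 28], [6 4], [7 6 4 3], [4 6], [6 4]};

    % Label every vector by the unique (space-padded) row it maps to
    [~, unqlabels, entry_labels] = unique(char(C{:}) - 0, 'rows');

    % Count how many entries carry each label
    count = accumarray(entry_labels(:), 1);

    % Text label such as '[46 36 28]' for each unique vector, as a column
    labels = cellfun(@mat2str, C(unqlabels), 'UniformOutput', false);
    labels = labels(:);

    % Most frequent combinations first
    [count, order] = sort(count, 'descend');
    tableData = [labels(order) num2cell(count)];
    disp(tableData)
    ```

    Here tableData is a 5x2 cell array of labels and counts. It could, for example, be assigned to the Data property of a uitable.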
    

    Edit: If you have negative numbers in there, you need to do a little more work to set up char_array1, as shown here. The rest of the code stays the same -

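    lens = cellfun(@numel,C); %// length of each vector
    mat1 = zeros(max(lens),numel(lens)); %// one zero-padded column per vector
    mat1(bsxfun(@ge,lens,[1:max(lens)]')) = horzcat(C{:}); %// fill the valid positions, column by column
    char_array1 = mat1'; %// one row per vector (numeric, despite the name)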
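
    With zero padding, a vector that has trailing zeros, e.g. [6 4 0], maps to the same row as [6 4]. One way around this, offered as a sketch and not as part of the original answer, is to prepend each vector's length as an extra column before calling unique. The names padded, mask, keys, firstIdx and groupIdx are illustrative, and the input is assumed to hold numeric row vectors.

    ```matlab
    C = {[1 2 3], [2 -4 5 6], [1 2 3], [6 4], [6 4 0], [4 6], [6 4]};

    lens = cellfun(@numel, C);                 % length of each vector
    padded = zeros(max(lens), numel(C));       % one zero-padded column per vector
    mask = bsxfun(@le, (1:max(lens))', lens);  % mask(i,j) is true when i <= lens(j)
    padded(mask) = [C{:}];                     % fill the valid positions, column by column

    % Prefix each row with the vector length so [6 4] and [6 4 0] stay distinct
    keys = [lens(:) padded'];

    [~, firstIdx, groupIdx] = unique(keys, 'rows');
    count = accumarray(groupIdx(:), 1);        % occurrences of each unique vector

    out = [C(firstIdx)' num2cell(count)];
    ```

    As in the answer, out pairs each unique vector with its count, and the order within each vector still matters.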