Search code examples
arraysmatlabmatrix-indexing

Grouping values together based on equivalence


I have a Nx2 matrix (in this case 6X2) that is as follows:

5  4
8  6
9  8 
10 9
15 14
16 15

I would like to group them in the following manner:

*(5,4) is the 1st group. It is checked with the remaining rows. Since neither 5 or 4 does not appear in the remaining rows, it may be kept as it is.

*(8,6) is the next group. However in the 3rd row 8 appears again as (9,8). So it is connected to the 2nd group ie 2nd group becomes (8,6,9,8).

*Next (10,9) is seen.But 9 belongs to the 2nd group and 10 is also added. So 2nd group is now (8,6,9,8,10,9).

*Next (15,14) is seen. It has not appeared or repeated so far. So it becomes the 3rd group ie (15,14).

*Finally, (16,15) is seen. But since 15 has repeated in the 3rd group, the group is now updated as (15,14,16,15).

In the end, unique command in MATLAB may be used to remove redundant elements in the groups.

I would like the final output as No of groups:

5   4
8   6   9   10
15  14  16

Note: The order of the elements in the row-wise group is not important as (5,4) or (4,5) is the same. Also I guess, the final output maybe appended with zeros so as to make it a matrix of equal dimensions i.e

5   4   0   0 
8   6   9  10 
15 14  16   0

Kindly help. Thank you.


Solution

  • So, I feel that there should be a way to do this using tree-based algorithm or a disjoint-set/union-find type approach. But for a dirty first approach, which is likely not all that efficient, I coded the following:

    A = [5 4
         8 6
         9 8     
        10 9
        15 14
        16 15
         8 8
         3 4
         3 1
         ];
    B = mat2cell(A,ones(size(A,1),1),2);
    
    iter = 1;
    while(true)
        if (iter > size(B,1))
            break
        end
        y = B{iter};
        C = cat(1,false(iter,1),cellfun(@(x) any(sum(bsxfun(@eq,x,y'))),B(iter+1:end)));
        if any(C)
            B{iter} = (cat(2,B{iter}, B{C}));
            % or put you can start shortening them here, by calling unique() already:
            %B{iter} = unique(cat(2,B{iter}, B{C}));
            B(C) = [];
        else
            iter = iter+1;
        end
    end
    
    B = cellfun(@unique,B,'uni',false);
    B{:}
    

    It works out pretty alright. I don't know if this would work out for what you want, but it was fun to code up quickly. Thanks for the fun problem.