I have a cell array whose elements are numeric arrays:
C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]}
I want to output:
D = {[3,4], [2], [4,5], [7]}
The sets in D are the only sets in C that do not contain any other set in C as a subset.
Please see the following link for a similar question. Although the code there is elegant, I have not (yet) been able to modify it to fit my particular question.
I would appreciate any help with a solution.
Thank you!
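As in the linked post, you can form the matrix s, which holds the number of common elements between every pair of sets. Set i is contained in set j exactly when s(i,j) equals the number of elements in set i (assuming no set lists the same element twice). The result would be: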
C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]};
n = cellfun(@numel,C); % number of elements in each set
v = repelem(1:numel(C),n); % row indices of the binary matrix (one row per set)
[~,~,u] = unique([C{:}]); % column indices of the binary matrix (one column per distinct value)
b = accumarray([v(:),u(:)],ones(size(v)),[],@max,[],true); % sparse binary membership matrix
s = b * b.'; % s(i,j) = number of elements shared by sets i and j
s(1:size(s,1)+1:end) = 0; % set diagonal elements to 0 (we do not need self-similarity)
result = C(~any(n(:) == s)); % keep the sets that contain no other set
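To see why the comparison works, here is a small sketch that prints the intermediate values for the example in the question. The names S, contained and isSuperset are only illustrative (they are not from the linked post), and the sketch assumes that no set lists the same element twice.

```matlab
% Worked example for the sets in the question.
C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]};
n = cellfun(@numel, C);                      % n = [4 2 1 3 2 1]
v = repelem(1:numel(C), n);
[~, ~, u] = unique([C{:}]);
b = accumarray([v(:), u(:)], ones(size(v)), [], @max, [], true);
s = b * b.';
s(1:size(s,1)+1:end) = 0;

S = full(s)
% S =
%      0     2     1     1     1     0
%      2     0     0     1     1     0
%      1     0     0     0     0     0
%      1     1     0     0     2     0
%      1     1     0     2     0     0
%      0     0     0     0     0     0

% contained(i,j) is true when all n(i) elements of set i also occur in set j.
contained = (n(:) == S);

% A column with any true entry belongs to a set that contains another set.
isSuperset = any(contained, 1)               % [1 0 0 1 0 0]
result = C(~isSuperset);                     % {[3,4], [2], [4,5], [7]}
```

Sets 1 and 4 are flagged because S(2,1) = n(2), S(3,1) = n(3) and S(5,4) = n(5), so they are dropped and the remaining four sets match D from the question.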
However, the matrix may be very large, so it is better to use a loop that handles one row at a time to avoid memory problems:
idx = false(1, numel(C));
for k = 1:numel(C)
    idx(k) = ~any(n == full(s(k, :))); % true if no other set is contained in set k
end
result = C(idx);
Or follow a vectorized approach:
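[r, c, v] = find(s); % row, column and value of every nonzero entry of s
idx = sub2ind(size(s), r, c);
s(idx) = v.' == n(r); % 1 where set r is entirely contained in set c
result = C(~any(s)); % keep the sets (columns) that contain no other set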
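If you want to sanity-check any of the variants on a small input, a plain double loop with ismember should give the same answer. This is a brute-force check, not the matrix method used above, and the names keep and D_check are just placeholders I picked for the sketch.

```matlab
% Brute-force cross-check: drop every set that contains another set of C.
C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]};
keep = true(1, numel(C));
for i = 1:numel(C)
    for j = 1:numel(C)
        % C{i} is a superset of C{j} if every element of C{j} is found in C{i}.
        if i ~= j && all(ismember(C{j}, C{i}))
            keep(i) = false;
            break
        end
    end
end
D_check = C(keep);   % {[3,4], [2], [4,5], [7]}
```

It does one ismember call per pair of sets, so it is only meant for verification. Like the matrix versions, it treats two identical sets as containing each other and drops both.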