I have a cell M with n number of cells, with each cell containing several unique numbers, something like this:
{[15 16 21 26 28 145],[2 5 8 9 15],[20 24 27],[10 11 15 8 6 258 74 1],...}
Some of these values appear in more than 1 cell. I would like to calculate the fraction of overlapping values across these cells. For instance, with the 4 cells above, I have 19 unique numbers, and 2 of them belong to more than 1 cell: 15 and 8. Thus, the fraction of overlapping cell is 2/19 = .105. Note that the number of cells in M can vary and thus the number of unique numbers in M also vary as well. Does anyone have any suggestion on how to do this efficiently? I've tried horzcat
to concatenate the cells within M then used unique
but didn't quite get what I want.
Using the output of the hist() function is useful here.
M = {[15 16 21 26 28 145],[2 5 8 9 15],[20 24 27],[10 11 15 8 6 258 74 1]};
% Bring data to one matrix.
M2 = cell2mat(M);
% Build a histogram from the data with a bin on each unique element. The
% first output of hist is the number of elements in each bin.
a = hist(M2,unique(M2));
% Calculate the overlap by dividing the number of elements that occur more
% than once by the total number of elements.
overlap = sum(a>1)/numel(a);