matlab matrix duplicates sparse-matrix duplicate-data

How can I find the indices of every row of a matrix that has a duplicate in MATLAB?


I want to find the indices of all the rows of a matrix which have duplicates. For example:

A = [1 2 3 4
     1 2 3 4
     2 3 4 5
     1 2 3 4
     6 5 4 3]

The vector to be returned would be [1,2,4]

A lot of similar questions suggest using the unique function, which I've tried but the closest I can get to what I want is:

[C, ia, ic] = unique(A, 'rows');

m = size(A, 1);      % m = 5
setdiff(1:m, ia)     % ia = [1 3 5], so this returns [2 4]

But with unique I can only extract the second, third, fourth, etc. instance of a duplicated row, and I also need the first. Is there any way I can do this?

NB: It must be a method which doesn't involve looping through the rows, as I'm dealing with large sparse matrices.


Solution

  • How about:

    [~, ia, ic] = unique(A, 'rows');
    
    % count how often each unique row occurs, then discard the rows that occur only once
    setdiff(1:size(A,1), ia( sum(bsxfun(@eq,ic,(1:max(ic))))<=1 ))
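
A possible variant, offered as a sketch rather than as part of the original answer: the bsxfun comparison builds a dense size(A,1)-by-max(ic) logical matrix, which may be costly when A has many rows. Counting group sizes with accumarray gives the same indices without that intermediate. The names counts and dupIdx are illustrative, and the example assumes unique(A, 'rows') accepts your matrix type.

```matlab
A = [1 2 3 4
     1 2 3 4
     2 3 4 5
     1 2 3 4
     6 5 4 3];

% ic(k) is the position of row k of A within the list of unique rows
[~, ~, ic] = unique(A, 'rows');

% counts(j) = number of rows of A equal to the j-th unique row
counts = accumarray(ic(:), 1);

% keep every row whose group has more than one member (first instance included)
dupIdx = find(counts(ic(:)) > 1).';   % for the example A: [1 2 4]
```

Because only ic is used, the result does not depend on whether unique reports the first or the last occurrence in ia.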