
Rows without repetitions - MATLAB


I have a matrix (4096x4) containing all possible combinations of four values taken from a pool of 8 numbers.

...
3    63    39     3
3    63    39    19
3    63    39    23
3    63    39    39
...

I am only interested in the rows of the matrix that contain four unique values. In the above section, for example, the first and last row should be removed, giving us -

...
3    63    39    19
3    63    39    23
...

My current solution feels inelegant: I iterate over every row and append it to a result matrix if it contains four unique values:

result = [];
for row = 1:size(matrix,1)
    if length(unique(matrix(row,:)))==4
        result =  cat(1,result,matrix(row,:));
    end
end

Is there a better way?


Solution

    Approach #1

    A diff and sort based approach, which should be quite efficient because it is fully vectorized -

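    % Sort each row so that equal values become adjacent
    sortedmatrix = sort(matrix,2);
    % Keep rows whose consecutive sorted values all differ
    result = matrix(all(diff(sortedmatrix,[],2)~=0,2),:);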
    

    Breaking it down into a few steps for explanation:

    1. Sort within each row (along dim-2) with sort, so that any duplicate values in a row end up next to each other.
    2. Take the differences between consecutive elements of each sorted row with diff. Adjacent duplicates show up as zeros.
    3. A row with at least one zero difference contains duplicate values. Put the other way, a row with no zero differences has four unique values, which is what we want in the output. all along dim-2 gives a logical column vector marking those rows.
    4. Finally, use that logical vector to index into matrix and select those rows for the expected output.
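
The four steps can be traced on a small input. The sketch below reuses the four rows from the question's excerpt as a stand-in for the full matrix. The intermediate names d and keep are introduced only for illustration.

```matlab
% Sample input: the four rows shown in the question's excerpt
matrix = [3 63 39  3;
          3 63 39 19;
          3 63 39 23;
          3 63 39 39];

sortedmatrix = sort(matrix, 2);   % step 1: sort within each row
d = diff(sortedmatrix, [], 2);    % step 2: consecutive differences per row
keep = all(d ~= 0, 2);            % step 3: true where no difference is zero
result = matrix(keep, :);         % step 4: logical row indexing
```

Here keep comes out as [0;1;1;0], so only the second and third rows survive, matching the output the question asks for.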

    Approach #2

    This is a more experimental bsxfun based approach. It won't be memory-efficient, since it builds an N x 4 x 4 logical array for an N x 4 input -

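    % matches(i,j,k) is true where matrix(i,j) equals matrix(i,k)
    matches = bsxfun(@eq,matrix,permute(matrix,[1 3 2]));
    % Keep rows in which every element matches only itself
    result = matrix(all(all(sum(matches,2)==1,2),3),:);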
    

    Breaking it down into a few steps for explanation:

    1. With bsxfun, build a logical array of matches that compares every element against all elements in the same row (including itself).
    2. Sum those matches along dim-2 of matches. A count of exactly 1 means the element matches only itself, so it is not duplicated. Requiring all counts to be 1 along dim-2 and dim-3 gives the same logical indexing array as the previous diff + sort based approach.
    3. Use that logical indexing array to select the appropriate rows from matrix for the final output.
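
To see what the intermediate arrays look like, here is a minimal sketch on a hypothetical 3 x 4 input (not from the original post). The names counts and keep are introduced only for illustration. On MATLAB R2016b or later, implicit expansion should allow matrix == permute(matrix,[1 3 2]) in place of the bsxfun call. The sketch keeps bsxfun so that it also runs on older releases.

```matlab
% Hypothetical sample input (not from the original post)
matrix = [1 2 3 4;
          1 2 2 4;
          5 5 5 5];

% matches(i,j,k) is true when matrix(i,j) equals matrix(i,k); size is 3x4x4
matches = bsxfun(@eq, matrix, permute(matrix, [1 3 2]));

% counts(i,1,k) = number of entries in row i equal to matrix(i,k); size is 3x1x4
counts = sum(matches, 2);

% A row is kept only if every entry matches nothing but itself
keep = all(all(counts == 1, 2), 3);
result = matrix(keep, :);
```

Only the first row has all counts equal to 1, so result is [1 2 3 4].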

    Approach #3

    Using the combinator submission from the MATLAB File Exchange, and assuming you have the pool of 8 values in an array named pool8, you can get result directly like so -

    result = pool8(combinator(8,4,'p'));
    

    combinator(8,4,'p') returns the index permutations of 8 elements taken 4 at a time without repetition, which is 8*7*6*5 = 1680 rows. We use these indices to index into the pool and get the expected output. The row order may differ from that of the filtered matrix.
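
If downloading combinator is not an option, the same rows can probably be generated with built-in functions alone. The following is only a sketch and not the File Exchange implementation. It assumes a hypothetical pool8 of eight distinct values (the question shows only some of them), enumerates 4-element index subsets with nchoosek, reorders each subset with perms, and cross-checks the result against the Approach #1 filter. The names subsets, orders, idx, direct and filtered are introduced only for illustration.

```matlab
% Hypothetical pool of 8 distinct values (only some appear in the question)
pool8 = [3 7 11 19 23 39 51 63];

% All 8^4 = 4096 ordered 4-tuples, as in the question's matrix
[a, b, c, d] = ndgrid(pool8);
matrix = [a(:) b(:) c(:) d(:)];

% Approach #1 filter, used here as the reference result
filtered = matrix(all(diff(sort(matrix, 2), [], 2) ~= 0, 2), :);

% Built-in alternative: every 4-element subset of 1:8, in every order
subsets = nchoosek(1:8, 4);   % 70 x 4 index subsets
orders  = perms(1:4);         % 24 x 4 orderings of the four columns
n = size(subsets, 1);
m = size(orders, 1);
idx = zeros(n*m, 4);
for k = 1:m
    % Apply the k-th column ordering to all subsets at once
    idx((k-1)*n + (1:n), :) = subsets(:, orders(k, :));
end
direct = pool8(idx);          % 1680 x 4, same shape as idx

% Row order differs between the two results, so compare after sorting rows
same = isequal(sortrows(direct), sortrows(filtered));
```

Generating the rows directly avoids building the 4096-row matrix first, which matters more as the pool or the tuple length grows.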