I'm trying to implement some sort of interpolation algorithm. I is an N*4 matrix holding the indices of the points surrounding N other points. But the elements in each row of I may not be unique, meaning two or more of them may refer to the same point. I want to know how many unique indices there are in each row, and I want to do it as fast as possible since N is big!
Well, Mohsen's answer is a general solution for this problem, but arrayfun was too slow for me. So I thought a little more about it and found a much faster solution.
I compare all pairs of columns and increment a counter wherever they are equal:
tic;
S = zeros(N, 1, 'uint32');  % number of equal column pairs in each row
Nu = S+4; % in my case most points are surrounded by four different points
for i=1:3
    for j=(i+1):4
        S = S + uint32(I(:, i)==I(:, j));
    end
end
% Nu(S==0) = 4;
Nu(S==1) = 3;
Nu((S==2)|(S==3)) = 2; % why? :)  (a,a,b,b) gives 2 equal pairs, (a,a,a,b) gives 3
Nu(S==6) = 1;
toc;
For N=189225, arrayfun takes 14.73s on my PC but summation takes only 0.04s.
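The mapping from S to Nu can be checked on a small example. The sketch below is only an illustration: the matrix is made up, with one row for each way four indices can repeat, and the brute-force loop over unique is there purely as a reference. It should give S = 0, 1, 2, 3, 6 against unique counts of 4, 3, 2, 2, 1, and no row of four values can produce S = 4 or S = 5.

```matlab
% One representative row for each equality pattern of four indices.
I = [1 2 3 4;   % all different        -> 0 equal pairs
     1 1 2 3;   % one repeated value   -> 1 equal pair
     1 1 2 2;   % two repeated values  -> 2 equal pairs
     1 1 1 2;   % three of a kind      -> 3 equal pairs
     1 1 1 1];  % all the same         -> 6 equal pairs
N = size(I, 1);

% Count equal column pairs per row, as in the code above.
S = zeros(N, 1);
for i = 1:3
    for j = (i+1):4
        S = S + (I(:, i) == I(:, j));
    end
end

% Brute-force reference: count unique values row by row.
NuRef = zeros(N, 1);
for r = 1:N
    NuRef(r) = numel(unique(I(r, :)));
end

disp([S NuRef]);   % left column: equal pairs, right column: unique count
```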
Edit: Take care of different numbers of columns
Here's a modification of the code above. Now we also get the positions of the unique values in each row! This one doesn't have the :) problem and can be used for larger numbers of columns. It still takes 0.04s on my PC for 189225 rows.
tic;
uniq = true(N, 4);
for i=1:3
    for j=(i+1):4
        uniq(I(:, i)==I(:, j), j) = false;
    end
end
Nu = sum(uniq, 2);
toc;
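For other column counts, the loop bounds can be taken from the matrix itself instead of being hard-coded. This is a minimal sketch of that idea: the variable M and the small 3-by-5 input are my own assumptions for illustration. A value is marked as a duplicate whenever it equals a value in an earlier column, so each row keeps only the first occurrence of each index.

```matlab
% Small hypothetical input: 3 rows, 5 columns.
I = [7 7 8 9 7;
     1 2 3 4 5;
     4 4 4 4 4];
[N, M] = size(I);            % M is the (assumed) number of columns

uniq = true(N, M);           % uniq(r, c) is true if I(r, c) is a first occurrence
for i = 1:(M-1)
    for j = (i+1):M
        % Column j repeats column i in these rows, so it is not unique there.
        uniq(I(:, i) == I(:, j), j) = false;
    end
end
Nu = sum(uniq, 2);           % number of unique values per row

disp(Nu');                   % 3 5 1
```

Note that this makes M*(M-1)/2 column comparisons, so the cost grows quadratically with the number of columns.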
Edit(2): Comparison with EBH's answer
After a while I needed this for another problem, where I wanted the number of unique elements in each row of matrices with different numbers of columns. So I compared my code with EBH's to see whether theirs is faster. I ran both on matrices with 10K to 100K rows and 6 to 60 columns. The results are the average time spent (in seconds) over 3 different runs:
I'm testing this in 2016a, and there has been a significant improvement in the performance of for-loops in the latest versions of MATLAB. So you may need to compare it yourself if you want to run it in older versions.
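If you want to repeat the timing on your own MATLAB version, something like the following may be a starting point. It is only a sketch under assumptions: the sizes and random data are made up, and the baseline is a plain row-by-row arrayfun over unique, which is my guess at a typical general solution, not necessarily the code from either of the other answers.

```matlab
N = 20000;  M = 4;               % hypothetical sizes, adjust to your case
I = randi(50, N, M);             % random indices with frequent repeats

% Assumed baseline: count unique values row by row with arrayfun.
tic;
NuA = arrayfun(@(r) numel(unique(I(r, :))), (1:N)');
tA = toc;

% Pairwise column comparison, as in the code above.
tic;
uniq = true(N, M);
for i = 1:(M-1)
    for j = (i+1):M
        uniq(I(:, i) == I(:, j), j) = false;
    end
end
NuB = sum(uniq, 2);
tB = toc;

assert(isequal(NuA, NuB));       % both methods must agree before comparing times
fprintf('arrayfun: %.3fs, pairwise: %.3fs\n', tA, tB);
```

Averaging over a few runs, as I did above, gives steadier numbers than a single tic/toc.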