I want to calculate the cosine similarity between different rows of a matrix in MATLAB. I wrote the following code:
for i = 1:n_row
for j = i:n_row
S2(i,j) = dot(S1(i,:), S1(j,:)) / (norm_r(i) * norm_r(j));
S2(j,i) = S2(i,j);
Matrix S1 is 11000-by-11000, and the code takes a very long time to run. Is there a function in MATLAB that computes the cosine similarity between matrix rows faster than the code above?
Your code loops over all rows, and for each row loops over (about) half the rows, computing the dot product for each unique combination of rows:
n_row = size(S1,1);
norm_r = sqrt(sum(abs(S1).^2,2)); % same as vecnorm(S1,2,2) in R2017b and later
S2 = zeros(n_row,n_row);
for i = 1:n_row
for j = i:n_row
S2(i,j) = dot(S1(i,:), S1(j,:)) / (norm_r(i) * norm_r(j));
S2(j,i) = S2(i,j);
end
end
(I've taken the liberty of completing your code so that it actually runs. Note the initialization of S2 before the loop; this saves a lot of time!)
If you note that the dot product is a matrix product of a row vector with a column vector, you can see that the above, without the normalization step, is identical to
S2 = S1 * S1.';
This runs much faster than the explicit loop, even if it is (maybe?) not able to use the symmetry. The normalization simply divides row i by norm_r(i) and column j by norm_r(j). Here I multiply norm_r by its transpose to produce a square matrix to normalize with:
S2 = (S1 * S1.') ./ (norm_r * norm_r.');
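To convince yourself that the vectorized expression matches the loop, a quick check on a small matrix might look like the sketch below. The 5-by-8 random matrix is made-up test data standing in for your S1, and the names S2_loop, S2_vec, S2_unit, and Sn are only for illustration. The sketch assumes S1 is real and has no all-zero rows (a zero row would give 0/0 = NaN). It also shows a variant that scales each row to unit length first, which avoids building the n_row-by-n_row matrix of norm products:

```matlab
% Small random stand-in for S1 (hypothetical test data, not the real matrix)
rng(0);
S1 = rand(5, 8);

% Row norms, computed as in the code above
norm_r = sqrt(sum(abs(S1).^2, 2));

% Reference result: explicit double loop over unique row pairs
n_row = size(S1, 1);
S2_loop = zeros(n_row, n_row);
for i = 1:n_row
    for j = i:n_row
        S2_loop(i,j) = dot(S1(i,:), S1(j,:)) / (norm_r(i) * norm_r(j));
        S2_loop(j,i) = S2_loop(i,j);
    end
end

% Vectorized result: one matrix product, then element-wise normalization
S2_vec = (S1 * S1.') ./ (norm_r * norm_r.');

% Variant: scale each row to unit length first, then one matrix product
Sn = bsxfun(@rdivide, S1, norm_r);
S2_unit = Sn * Sn.';

% Largest absolute differences; these should be at round-off level
max_diff_vec  = max(abs(S2_loop(:) - S2_vec(:)));
max_diff_unit = max(abs(S2_loop(:) - S2_unit(:)));
```

Both differences should be on the order of floating-point round-off, and the diagonal of each result should be 1 up to round-off. I have not timed the variants on an 11000-by-11000 input, so which one is faster there is something to measure with tic/toc rather than assume.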