I have two equally-sized matrices A and B, which are really collections of independent column vectors. I need to perform a "matrix multiplication" (* or mtimes) of every column in A with the corresponding column in B. The result is a vector whose number of elements equals the number of columns. The easy way is a loop, but I need to do this thousands of times, and my matrices have more than 2000 rows and 16 to 200 columns - so vectorization would help.
Additionally, there is a weighting matrix involved.
Consider the following example:
nDof = 250; % number of rows
nCols = 16; % number of columns
% Example complex matrices
A = rand(nDof,nCols) + 1j*rand(nDof,nCols);
B = rand(nDof,nCols) + 1j*rand(nDof,nCols);
% Weighting Matrix
W = rand(nDof,nDof) + 1j*rand(nDof,nDof);
% Option 1: Loop - multiply matrices column by column, with/without weighting
X1 = zeros(nCols,1); Y1 = zeros(nCols,1);
for ii = 1:nCols
    X1(ii) = A(:,ii)' * B(:,ii);
    Y1(ii) = A(:,ii)' * W * B(:,ii);
end
% Option 2: avoid loop
X2temp = A' * B;
Y2temp = A' * W * B;
% => the diagonal elements appear to be what I need
X2 = diag(X2temp);
Y2 = diag(Y2temp);
% Test
disp(X2-X1)
disp(Y2-Y1)
As you can see, in Option 2 I tried to avoid the loop by multiplying the full matrices directly. The diagonal elements of the resulting matrix are very close to the result of the loop, but only close, not equal. Although the error is probably small enough for my case, it seems too large to be caused by floating-point inaccuracies alone.
Test results:
X2-X1 = 1.0e-13 *
0.426325641456060 + 0.355271367880050i
0.284217094304040 + 0.852651282912120i
-0.284217094304040 - 0.426325641456060i
0.000000000000000 + 0.142108547152020i
-0.284217094304040 - 0.213162820728030i
0.284217094304040 - 0.923705556488130i
0.568434188608080 + 0.355271367880050i
0.000000000000000 + 0.142108547152020i
0.000000000000000 + 0.071054273576010i
-0.284217094304040 + 0.497379915032070i
0.284217094304040 - 0.426325641456060i
-0.284217094304040 + 0.142108547152020i
0.000000000000000 + 0.000000000000000i
-0.284217094304040 + 0.142108547152020i
-0.284217094304040 - 0.213162820728030i
0.284217094304040 - 0.284217094304040i
Y2-Y1 = 1.0e-10 *
0.000000000000000 + 0.036379788070917i
-0.018189894035459 + 0.072759576141834i
-0.072759576141834 - 0.072759576141834i
0.036379788070917 - 0.109139364212751i
0.090949470177293 + 0.054569682106376i
-0.109139364212751 + 0.054569682106376i
0.109139364212751 - 0.163709046319127i
0.018189894035459 - 0.127329258248210i
0.072759576141834 - 0.018189894035459i
0.036379788070917 - 0.036379788070917i
0.018189894035459 + 0.000000000000000i
0.109139364212751 - 0.072759576141834i
-0.018189894035459 - 0.018189894035459i
0.000000000000000 - 0.054569682106376i
-0.018189894035459 + 0.036379788070917i
0.036379788070917 - 0.036379788070917i
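For scale, the corresponding relative errors can be computed like this (using the variables from the example above):
% Largest relative deviations between the two options
disp(max(abs(X2 - X1) ./ abs(X1)))
disp(max(abs(Y2 - Y1) ./ abs(Y1)))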
So: what is the correct way to do this, and what is the fastest?
Thank you!
You can take the element-wise product and then sum along the first dimension:
X = sum(conj(A) .* B, 1);
Y = sum(conj(A) .* (W*B), 1);
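For the example in the question, X and Y come out as 1-by-nCols row vectors, so a quick sanity check against the loop might look like this (a sketch reusing X1 and Y1 from the question):
disp(max(abs(X.' - X1)))  % differs from the loop only at round-off level
disp(max(abs(Y.' - Y1)))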
Edit: Actually, there is a built-in function that does exactly this, called dot:
X = dot(A, B);
Y = dot(A, W*B);
Transpose (.') the results if you want them as column vectors (note that the adjoint ' would additionally conjugate the values).
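To see the conjugation behaviour that makes dot match A(:,ii)' * B(:,ii), here is a minimal example with arbitrary values:
% dot conjugates its first argument, just like ' in the loop:
a = [1+2i; 3];
b = [4; 5i];
disp(dot(a, b))  % (1-2i)*4 + 3*5i = 4+7i
disp(a' * b)     % same value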
I'm guessing that dot will tend to be more accurate than the sum( .* ) variant, mostly because it can better use FMA operations. If you want to check the accuracy, I suggest generating random single-precision data, computing the dot products with either approach in single precision, and comparing against the result you get when converting the data to double precision first.
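A sketch of that test (assuming the same nDof/nCols setup as in the question; the variable names are mine):
% Single-precision complex inputs
As = single(rand(nDof,nCols) + 1j*rand(nDof,nCols));
Bs = single(rand(nDof,nCols) + 1j*rand(nDof,nCols));
% Double-precision reference computed from the same data
refX = dot(double(As), double(Bs));
% Compare both single-precision approaches against the reference
errDot = max(abs(double(dot(As, Bs)) - refX));
errSum = max(abs(double(sum(conj(As) .* Bs, 1)) - refX));
disp([errDot, errSum])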