Search code examples
matlabmatrixvectorizationbsxfun

Use of bsxfun with singleton expansion with matrixes of three dimensions


I'm using bsxfun to vectorize an operation with singleton expansion between matrixes of sizes:

MS: (nms, nls)
KS: (nks, nls)

The operation is the sum of the absolute differences between each value MS(m,l) with m in 1:nms and l in 1:nls, and every KS(k,l) with k in 1:nks.

I achieve this through the code:

[~, nls] = size(MS);
MS = reshape(MS',1,nls,[]);
R = sum(abs(bsxfun(@minus,MS,KS)));

R is of size (nls, nms).

I want to generalize this operation to a list of samples, so the new sizes will be:

MS: (nxs, nls, nms)
KS: (nxs, nls, nks)

This can be achieved easily with a for loop that executes the first piece of code for each 2 dimensional matrixes, but I suspect that performance may be much better by generalizing the previous code by adding a new dimension.

R has would be of size: (nxs, nls, nms)

I have tried to reshape MS to 4 dimensions with no success. Could this be done with reshaping and bsxfun?


Solution

  • You might need this:

    % generate small dummy data
    nxs = 2;
    nls = 3;
    nms = 4;
    nks = 5;
    MS = rand(nxs, nls, nms);
    KS = rand(nxs, nls, nks);
    
    R = sum(abs(bsxfun(@minus,MS,permute(KS,[1,2,4,3]))),4)
    

    This will produce a matrix of size [2,3,4], i.e. [nxs,nls,nms]. Each element [k1,k2,k3] will correspond to

    R(k1,k2,k3) == sum_k abs(MS(k1,k2,k3) - KS(k1,k2,k))
    

    For instance, in my random run

    R(2,1,3)
    
    ans =
    
       1.255765020150647
    
    >> sum(abs(MS(2,1,3)-KS(2,1,:)))
    
    ans =
    
       1.255765020150647
    

    The trick is to introduce singleton dimensions with permute: permute(KS,[1,2,4,3]) is of size [nxs,nls,1,nks], while MS of size [nxs,nls,nms] is implicitly also of size [nxs,nls,nms,1]: every array in MATLAB is assumed to possess a countably infinite number of trailing singleton dimensions. From here it's easy to see how you can bsxfun together arrays of size [nxs,nls,nms,1] and [nxs,nls,1,nks], respectively, to obtain one with size [nxs,nls,nms,nks]. Summing along dimension 4 seals the deal.


    I noted in a comment, that it might be faster to permute the summing index to be in the first place. Turns out that this by itself makes the code run slower. However, by reshaping the arrays to have decreasing dimension sizes, the overall performance increases (due to optimal memory access). Compare this:

    % generate larger dummy data
    nxs = 20;
    nls = 30;
    nms = 40;
    nks = 500;
    MS = rand(nxs, nls, nms);
    KS = rand(nxs, nls, nks);
    
    MS2 = permute(MS,[4 3 2 1]);
    KS2 = permute(KS,[3 4 2 1]);
    R3 = permute(squeeze(sum(abs(bsxfun(@minus,MS2,KS2)),1)),[3 2 1]);
    

    What I did was put the summing nks dimension into first place, and order the rest of the dimensions in decreasing order. This could be done automatically, I just didn't want to overcomplicate the example. In your use case you'll probably know the magnitude of the dimensions anyway.

    Runtimes with the above two codes: 0.07028 s for the original, 0.051162 s for the reordered one (best out of 5). Larger examples don't fit into memory for me now, unfortunately.