
Improve code / remove for-loop when using accumarray MATLAB


I have the following piece of code, which computes percentiles from a data set ("Data"). It is quite slow because the inputs are large: "Data" has approx. 500,000 elements, grouped by 10080 unique values in "Indices".

Is there a way to make this piece of code more efficient? For example, could I somehow omit the for-loop?

k = 1;
for i = 0:0.5:100; % in 0.5 fractile-steps
     FRACTILE(:,k) = accumarray(Indices,Data,[], @(x) prctile(x,i));
     k = k+1;
end

Solution

  • Calling prctile again and again with the same data is what causes your performance issues. Call it only once for each group of data:

    FRACTILE=cell2mat(accumarray(Indices,Data,[], @(x) {prctile(x,[0:0.5:100])}));
    

    Letting prctile evaluate all 201 percentiles in one call takes roughly as much computation time as two iterations of your original loop. There are two reasons: prctile is faster when it gets all percentiles at once, and accumarray is now called only once.
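
    To check that both approaches give the same result, and to time them on your own machine, something like the following sketch may help. The data here is synthetic and much smaller than in the question; nGroups, nValues and the other variable names are made up for illustration. It assumes every group index occurs at least once, and the reshape is an extra precaution (not part of the one-liner above) that forces one row per group however prctile orients its output.

```matlab
% Synthetic stand-in for the question's data (sizes are assumptions, kept small).
nGroups = 50;
nValues = 5000;
% Prepend 1:nGroups so that every group index appears at least once.
Indices = [(1:nGroups)'; randi(nGroups, nValues - nGroups, 1)];
Data    = randn(nValues, 1);
p       = 0:0.5:100;                  % 201 percentiles, as in the question

% Original approach: one accumarray call per percentile.
tic;
loopResult = zeros(nGroups, numel(p));
for k = 1:numel(p)
    loopResult(:, k) = accumarray(Indices, Data, [], @(x) prctile(x, p(k)));
end
tLoop = toc;

% Single-call approach: prctile runs once per group, each result is wrapped in a cell.
% reshape(..., 1, []) turns each group's percentiles into a row vector.
tic;
cellResult = accumarray(Indices, Data, [], @(x) {reshape(prctile(x, p), 1, [])});
vecResult  = cell2mat(cellResult);    % stack the rows: nGroups-by-numel(p)
tVec = toc;

% Compare the two results and report the timings.
maxDiff = max(abs(loopResult(:) - vecResult(:)));
fprintf('loop: %.3f s, single call: %.3f s, max difference: %g\n', tLoop, tVec, maxDiff);
```

    If some values in 1:max(Indices) never occur in Indices, accumarray returns empty cells for those groups, and they would need separate handling before cell2mat; the sketch avoids that case by construction.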