
Why does MATLAB's element-wise exponentiation speed up for 512 elements?


MATLAB's power function, used for element-wise exponentiation with a constant base and an array of exponents, becomes noticeably faster once the array reaches 512 elements. I expected the computation time to increase with the input size, but there is a noticeable drop when the array of exponents has 512 elements. Here is some sample code:

x_list = 510:514;
for i = 1:numel(x_list)
    x = x_list(i);
    tic
    for j = 1:10000
        y = power(2,1:x); 
    end
    toc
end

The output of the code is

Elapsed time is 0.397649 seconds.
Elapsed time is 0.403687 seconds.
Elapsed time is 0.318293 seconds.
Elapsed time is 0.238875 seconds.
Elapsed time is 0.175525 seconds.

What is happening here?



Solution

  • I see the same effect with random exponents as with integers in the range 1:n:

    x = 500:540;
    t = zeros(size(x));
    for ii = 1:numel(x)
        %m = 1:x(ii);
        m = 500*rand(1,x(ii));
        t(ii) = timeit(@()power(2,m));
    end
    plot(x,t)
    

    graph showing jump down in execution time around 512

    After forcing MATLAB to use a single thread with maxNumCompThreads(1) and running the code above again, I see this graph instead (note the y-axis scale; the peaks are just noise):

    graph not showing the jump at 512

    It looks to me as if MATLAB uses a single core to compute the power for arrays of up to 511 elements, and fires up all cores for larger arrays. Multithreading has an overhead, so it is not worthwhile for small arrays. The exact point at which the overhead is balanced by the time savings depends on many factors, so a hard-coded threshold for switching to multithreaded computation produces a jump in execution time on systems whose characteristics differ from those of the system where the threshold was determined.

    Note that @norok2 is not seeing this same jump because on their system MATLAB was limited to a single thread.
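
The single-thread versus multithreaded comparison described above can be reproduced in one script. This is only a minimal sketch, assuming a MATLAB release that provides maxNumCompThreads and timeit. The variable names (sizes, inputs, tSingle, tMulti, prevThreads) are illustrative and not taken from the original answer, and the shape of the curves will depend on the machine.

```matlab
% Sketch: time power(2, m) for array sizes around 512, first with one
% computational thread, then with the thread count MATLAB had before.
sizes = 500:540;                          % array lengths, as in the answer above

% Generate the exponent arrays once so both runs use identical inputs.
inputs = arrayfun(@(n) 500 * rand(1, n), sizes, 'UniformOutput', false);

tSingle = zeros(size(sizes));
tMulti  = zeros(size(sizes));

% Force a single thread; the call returns the previous setting.
prevThreads = maxNumCompThreads(1);
for ii = 1:numel(sizes)
    m = inputs{ii};
    tSingle(ii) = timeit(@() power(2, m));
end

% Restore the earlier thread count and repeat the measurement.
maxNumCompThreads(prevThreads);
for ii = 1:numel(sizes)
    m = inputs{ii};
    tMulti(ii) = timeit(@() power(2, m));
end

plot(sizes, tSingle, sizes, tMulti)
legend('1 thread', sprintf('%d thread(s)', prevThreads))
xlabel('number of elements')
ylabel('time (s)')
```

If the explanation in the answer holds on a given machine, the two curves should roughly coincide below 512 elements and separate from there on. If prevThreads is already 1, as it apparently was on @norok2's system, no jump is expected in either curve.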