Tags: performance, matlab, vectorization, flops

Really slow loop with vector-scalar multiplication in MATLAB


Have I done something wrong or is vector-by-scalar multiplication really so costly? Doesn't MATLAB (ver 2012a or higher) optimize the code somehow to prevent such curiosities?

>> tic; for i=1:100000; x = sin(i)*[1; 1]; end; toc;
Elapsed time is 1.338225 seconds.
>> tic; for i=1:100000; x = sin(i).*[1; 1]; end; toc;
Elapsed time is 1.228331 seconds.
>> tic; for i=1:100000; x = [sin(i); sin(i)]; end; toc;
Elapsed time is 0.073888 seconds.
>> tic; for i=1:100000; tmp=sin(i); x = [tmp; tmp]; end; toc;
Elapsed time is 0.072120 seconds.

What guidelines could you give me so that floating-point operations in MATLAB take only the time they really need?

PS. This is sample code only. What I am actually doing is solving systems of ODEs, and I want to optimize the runtime of calculating the needed differentials. The timings above make me worry that I might be doing something in a non-optimal way.


Solution

  • "Doesn't MATLAB (ver 2012a or higher) optimize the code somehow to prevent such curiosities?"

    Yes, it does if the code is in a function m-file (or, as the edit below notes, a script m-file), thanks to the JIT (just-in-time) compiler and/or the accelerator.

    However, as mentioned in the comments and other answers, vectorisation is still generally the better option where possible.
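
    As a minimal sketch of what vectorising the sample loop could look like (the names n, s and X are illustrative, and real ODE code will have a different shape), sin can be evaluated for all indices in one call and the two-element results collected as the columns of a 2-by-n matrix:

    ```matlab
    % Vectorised counterpart of the sample loop (illustrative names).
    n = 100000;
    s = sin(1:n);        % 1-by-n row vector holding sin(i) for every i
    X = [1; 1] * s;      % 2-by-n outer product; X(:, i) equals sin(i)*[1; 1]
    x = X(:, end);       % the value the loop would leave in x after its last pass
    ```

    This only pays off when all the columns are actually needed; if each iteration depends on the previous one, as in a time-stepping scheme, the loop cannot be removed this way.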

    Directly on the command line:

    tic; for i=1:100000; x1 = sin(i)*[1; 1]; end; toc;
    tic; for i=1:100000; x2 = sin(i).*[1; 1]; end; toc;
    tic; for i=1:100000; x3 = [sin(i); sin(i)]; end; toc;
    tic; for i=1:100000; tmp=sin(i); x4 = [tmp; tmp]; end; toc;
    Elapsed time is 1.795528 seconds.
    Elapsed time is 1.606081 seconds.
    Elapsed time is 0.072672 seconds.
    Elapsed time is 0.065904 seconds.
    

    Within a function:

    [x1,x2,x3,x4]=foo();
    Elapsed time is 0.029698 seconds.
    Elapsed time is 0.035248 seconds.
    Elapsed time is 0.064080 seconds.
    Elapsed time is 0.054499 seconds.
    

    with the function foo saved as:

    function [x1,x2,x3,x4]=foo()
    
    tic; for i=1:100000; x1 = sin(i)*[1; 1]; end; toc;
    tic; for i=1:100000; x2 = sin(i).*[1; 1]; end; toc;
    tic; for i=1:100000; x3 = [sin(i); sin(i)]; end; toc;
    tic; for i=1:100000; tmp=sin(i); x4 = [tmp; tmp]; end; toc;
    
    end
    

    Edit

    While trying to find documentation to support the claims above, I realised I had made a mistake: script m-files are accelerated as well, not only function m-files.

    Within a script:

    fooscript;
    Elapsed time is 0.033536 seconds.
    Elapsed time is 0.033720 seconds.
    Elapsed time is 0.066050 seconds.
    Elapsed time is 0.058428 seconds.
    

    with the script fooscript containing:

    tic; for i=1:100000; x1 = sin(i)*[1; 1]; end; toc;
    tic; for i=1:100000; x2 = sin(i).*[1; 1]; end; toc;
    tic; for i=1:100000; x3 = [sin(i); sin(i)]; end; toc;
    tic; for i=1:100000; tmp=sin(i); x4 = [tmp; tmp]; end; toc;
    

    Sadly, there is very little documentation on the JIT and the accelerator, if any. For comparison, however, you can disable acceleration or the JIT with feature('accel','on'/'off') and feature('jit','on'/'off'). (Note: disabling accel also disables the JIT, which seems to be a part of accel.)
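
    A comparison along these lines could be run as shown below. This is only a sketch: feature is undocumented, the 'accel' option is taken from the paragraph above and may behave differently or be ignored in other releases, and foo is the function listed earlier.

    ```matlab
    feature('accel', 'off');      % undocumented switch; per the text this also disables the JIT
    [x1, x2, x3, x4] = foo();     % prints four elapsed times with acceleration off
    feature('accel', 'on');       % switch acceleration back on
    [x1, x2, x3, x4] = foo();     % prints four elapsed times with acceleration on
    ```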

    The performance improvement is smaller when accel is disabled; however, function and script performance remain similar to each other, and both are still noticeably faster than the command line.

    Disabling the JIT alone had no noticeable effect on performance, so the original statement attributing the speed-up to the JIT was wrong.