matlab, parallel-processing, matlabpool

Why does a parfor loop take more time than a for loop?


function test_parfor
N = 1e8;
sum_all = 0; % sum all numbers
sum_odd = 0; % sum odd numbers
sum_even = 0; % sum even numbers
tic;
parfor i = 1 : N
  sum_all = sum_all + i; % sum all numbers
  if mod(i,2)
      sum_odd = sum_odd + i; % sum odd numbers
  else
      sum_even = sum_even + i; % sum even numbers
  end %endif
end %endfor
toc;
fprintf('sum_all=%d,\nsum_odd=%d,\nsum_even=%d.\n', ...
    sum_all, sum_odd, sum_even);

I have initialized the parpool environment and run the code above. However, the parfor loop took far more time than a plain for loop. Furthermore, my PC has 12 cores (numCores is 12), and I initialized 12 workers before running the function. Why? What is wrong with my code?

Thank you very much! :-)

In addition, the code that initializes the parallel computing environment is as follows.

function initpar(CoreNum)
%Initialize MATLAB Parallel Computing Environment

if nargin==0
    CoreNum=feature('numCores');
end
if  isempty(gcp('nocreate'))
    parpool('local',CoreNum); % matlabpool in releases before R2013b
else
    disp('Parallel Computing Environment already initialized');
end

Solution

  • See this page:

    Parallel overhead. There is overhead in calling parfor instead of for. If function evaluations are fast, this overhead could become appreciable. In particular, solving a problem in parallel can be slower than solving the problem serially.

    The tip is not to use parfor when each iteration is cheap; the break-even cost per iteration of course depends on your hardware.

    EDIT: if you remove the if-else block, the behavior of the loop changes. sum_all is now detected as a reduction variable, as explained in this page. The loop is then correctly split into independent parts, and the partial results are merged at the end. With the if-else block, sum_even and sum_odd are not treated as reduction variables (I think), so the loop behaves like a classic for-loop plus the parallel computing overhead.
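
One way to act on the overhead advice above is to give each parfor iteration much more work. The sketch below is only an illustration, not code from the question: the function name chunked_sums and the parameter numChunks are made up for this example. It splits 1..N into a few large chunks, sums each chunk with vectorized operations, and lets parfor combine the partial sums through plain reduction assignments with no if-else inside the loop.

```matlab
function [sum_all, sum_odd, sum_even] = chunked_sums(N, numChunks)
% Hypothetical helper (not part of the original post): sum 1..N in a few
% large chunks so that each parfor iteration does a substantial amount of work.
edges = round(linspace(0, N, numChunks + 1)); % chunk k covers edges(k)+1 .. edges(k+1)
sum_all  = 0;
sum_odd  = 0;
sum_even = 0;
parfor k = 1:numChunks
    v = (edges(k) + 1):edges(k + 1);       % integers handled by this chunk
    isOdd = mod(v, 2) == 1;                % logical mask of odd entries
    sum_all  = sum_all  + sum(v);          % reduction assignment
    sum_odd  = sum_odd  + sum(v(isOdd));   % reduction assignment
    sum_even = sum_even + sum(v(~isOdd));  % reduction assignment
end
end
```

Saved as chunked_sums.m, it could be called as [a, o, e] = chunked_sums(1e8, 48); and timed with tic/toc against the original loop. Each chunk builds a temporary vector of about N/numChunks doubles, so a larger numChunks trades a little scheduling overhead for less memory per worker. Whether this actually beats a plain for loop still depends on the hardware and the pool, so it is worth measuring rather than assuming.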