I compared the following two pieces of code. Serial:
N = 500;
M = rand(500,500,N);
R = zeros(500,500,N);
tic
for k = 1:N
R(:,:,k) = inv(M(:,:,k));
end
toc
Parallel:
N = 500;
M = rand(500,500,N);
R = zeros(500,500,N);
tic
parfor k = 1:N
R(:,:,k) = inv(M(:,:,k));
end
toc
The serial version runs about 3 times faster than the parallel version, even though I have 4 local cores available and all of them appear to be in use. Any thoughts on why this is happening?
Remember that many MATLAB operations (especially large linear algebra operations) are intrinsically multi-threaded. In this case, `inv` is multi-threaded and is the dominant cost in your `for` loop. When you convert that to a `parfor` loop, if you only have the 'local' cluster type available, then you have no more computational cores available in `parfor` than you did in `for`. Therefore, the `parfor` loop can only be slower than the `for` loop, because it additionally has to transmit data to the workers for them to operate on.
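One rough way to see how much of the serial speed comes from implicit multi-threading is to time `inv` with the default thread limit and again with MATLAB restricted to a single computational thread. This is only a sketch: the matrix size and the variable names are arbitrary choices for illustration, and the timings will depend on your machine.

```matlab
% Compare inv() with the default thread limit against a single thread.
nDefault = maxNumCompThreads;      % current limit on computational threads
A = rand(1000);                    % test matrix (size chosen arbitrarily)

tic; B = inv(A); tMulti = toc;     % default: may use several threads

prev = maxNumCompThreads(1);       % restrict to one thread, keep the old setting
tic; B = inv(A); tSingle = toc;
maxNumCompThreads(prev);           % restore the previous limit

fprintf('threads: %d, multi: %.3f s, single: %.3f s\n', ...
        nDefault, tMulti, tSingle);
```

If the single-threaded run is clearly slower, that gap is roughly what each `parfor` worker gives up, since pool workers run single-threaded by default as far as I know.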
In general, if you have only 'local' workers available, then `parfor` can beat `for` only when MATLAB cannot multi-thread the body of the `for` loop.
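For contrast, here is a sketch of a loop body that MATLAB is unlikely to multi-thread on its own: a scalar accumulation in an inner `for` loop. The workload, the sizes `N` and `nInner`, and the variable names are made up for illustration, and it assumes the Parallel Computing Toolbox is installed. On a loop like this `parfor` has a chance of winning, though pool start-up time and per-iteration overhead still count, so measure on your own machine.

```matlab
% A loop body made of scalar operations, which cannot use implicit threading.
N = 8;                 % number of independent tasks (hypothetical)
nInner = 2e6;          % scalar work per task (hypothetical)

outSerial = zeros(1, N);
tic
for k = 1:N
    s = 0;
    for i = 1:nInner
        s = s + sin(i * k);    % scalar work, one element at a time
    end
    outSerial(k) = s;
end
tSerial = toc;

outPar = zeros(1, N);
tic
parfor k = 1:N
    s = 0;                     % temporary variable, private to each iteration
    for i = 1:nInner
        s = s + sin(i * k);
    end
    outPar(k) = s;             % sliced output variable
end
tPar = toc;

fprintf('serial: %.3f s, parfor: %.3f s\n', tSerial, tPar);
```

Run the `parfor` part twice if no pool is open yet, because the first call includes the time to start the workers.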