matlab, matrix, memory, optimization, parfor

Huge broadcast variable, optimizing code without parfor?


I have a 40000 by 80000 matrix from which I obtain the number of "clusters" (groups of adjacent elements that share the same value) and then calculate the size of each of those clusters. Here is the relevant chunk of code:

FRAGMENTSIZESCLASS = cell(NumberOfClasses,1);  %We store the data in a cell array (it is indexed with {} below)
for classIdx=1:NumberOfClasses                 %"classIdx" avoids shadowing the built-in function class
  %-First we create a binary image for each class-%
  BWclass = foto==classIdx;
  %-Second we calculate the number of connected components (fragments)-%
  L = bwlabeln(BWclass);          %returns a label matrix, L, containing labels for the connected components in BWclass
  clear BWclass
  NumberFragments=max(L(:));
  %-Third we calculate the size of each fragment-%
  FragmentSize=zeros(NumberFragments,1);
  for f=1:NumberFragments      % potential improvement: using parfor while sharing the memory between workers
    FragmentSize(f,1) = sum(L(:) == f);   % one full scan of L per fragment
  end
  FRAGMENTSIZESCLASS{classIdx}=FragmentSize;
  clear L
end

The problem is that the matrix L is so large that, if I use a parfor loop, it becomes a broadcast variable: every worker gets its own copy, so the memory use is multiplied and I run out of memory.

Any ideas on how to sort this out? I've seen this file: https://ch.mathworks.com/matlabcentral/fileexchange/28572-sharedmatrix but it is not a straightforward solution, and even with 24 cores it will still take a lot of time.

Cheers!


Here is a picture showing the time taken as a function of image size when using the code I posted in the question vs. using bwconncomp as suggested by @bla: [figure: run time vs. image size for both approaches]


Solution

  • Instead of bwlabeln, use the built-in function bwconncomp, for example:

    ...
    s=bwconncomp(BWclass);
    FragmentSize=cellfun(@numel,s.PixelIdxList);  % one entry per fragment: its pixel count
    ...
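
To make the suggestion concrete, here is a minimal sketch of the whole per-class loop built around bwconncomp. It assumes the Image Processing Toolbox is available. The tiny matrix foto, the value of NumberOfClasses, and the name FragmentSizesByClass are made up for illustration. They are not the asker's data. Each entry of PixelIdxList holds the linear indices of one fragment, so numel of each entry gives that fragment's size without building a label matrix or rescanning the image once per fragment.

```matlab
% Hypothetical stand-in for the real 40000-by-80000 image (classes 1..3)
foto = [1 1 2 2;
        1 3 3 2;
        2 3 1 1;
        2 2 1 1];
NumberOfClasses = 3;

FragmentSizesByClass = cell(NumberOfClasses, 1);   % one cell per class
for c = 1:NumberOfClasses
    % Connected components of the binary mask for class c
    % (default connectivity for 2-D input is 8, as with bwlabeln)
    CC = bwconncomp(foto == c);
    % Pixel count of every fragment, returned as a column vector
    FragmentSizesByClass{c} = cellfun(@numel, CC.PixelIdxList).';
end

% Total number of pixels accounted for across all classes
TotalPixels = sum(cellfun(@sum, FragmentSizesByClass));
```

If a label matrix L is already in memory for other reasons, accumarray(L(L>0), 1) is another way to count all fragment sizes in a single pass instead of one pass per fragment. Either way the work stays serial, so no broadcast copy of L is needed.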