
Optimization doesn't converge with parallel_run function in Scilab


I'm trying to perform an optimization in Scilab, and I want to run a differential evolution code in parallel using the parallel_run function.

The original version of the code uses a for loop for the part I want to parallelize, and it works just fine. When I modify the code to use parallel_run and run it on a Windows machine, it still works, but as far as I know this function is not supported on Windows and there it only runs on a single core. Finally, I ran the modified code on a Linux machine, again with no errors; however, the optimization didn't converge and gave a much worse final result.

While trying to figure out the problem, I realized that some of the code's printed output appeared in the Scilab console while the rest appeared in the terminal. And even though some calculations clearly took place in the terminal processes, the optimization couldn't retrieve their results.

This is the for loop from the original version of the code:

//-----Select which vectors are allowed to enter the new population------------
  for i=1:NP
    tempval = fct(ui(:,i),y);   // check cost of competitor
    nfeval  = nfeval + 1;
    if (tempval <= val(i))  // if competitor is better than value in "cost array"
       pop(:,i) = ui(:,i);  // replace old vector with new one (for new iteration)
       val(i)   = tempval;  // save value in "cost array"

       //----we update optval only in case of success to save time-----------
       if (tempval < optval)     // if competitor better than the best one ever
          optval = tempval;      // new best value
          optarg = ui(:,i);      // new best parameter vector ever
       end;
    end;
  end; //---end for i=1:NP

And here is the nested function I'm using to replace the loop, and trying to make it run in parallel:

function gpar(i);
  disp("called on "+string(i));
  global nfeval
  global val
  global pop
  global optval
  global optarg
  tempval = fct(ui(:,i),y);   // check cost of competitor
  nfeval  = nfeval + 1;       // count this function evaluation
  disp("tempval "+string(tempval));
  disp("val(i) "+string(val(i)));
  disp("optval "+string(optval));
  disp("bef_pop(i) "+string(pop(:,i)));
  if (tempval <= val(i))  // if competitor is better than value in "cost array"
     pop(:,i) = ui(:,i);  // replace old vector with new one (for new iteration)
     val(i)   = tempval;  // save value in "cost array"

     //----we update optval only in case of success to save time-----------
     if (tempval < optval)     // if competitor better than the best one ever
        optval = tempval;      // new best value
        optarg = ui(:,i);      // new best parameter vector ever
     end;
  end;
  disp("aft_pop(i) "+string(pop(:,i)));

endfunction; //---end of gpar

  parallel_run(1:NP, "gpar"); //calling function gpar in parallel
  disp("popThisGen "+string(pop)); //display the population after changes

Here the i-th column of the pop matrix is changed if some condition is met, and I'm printing that column before and after the if statements. The index i runs from 1 to 40. I can see the first 10 of these prints in the Scilab console, and the last 30 in the terminal (my machine has four cores, which I think has something to do with it). Then I print the whole pop matrix after parallel_run completes its job. From this final version of pop I realized that only the changes I observed in the Scilab console took effect, and none of the changes I observed in the terminal.

The complete version of the original code is at http://www1.icsi.berkeley.edu/~storn/code.html#scil

I think these lost updates are the reason for the bad results I'm getting. Does anyone have an idea about what might be going on? Thanks.


Solution

  • Answering in kind of a hurry, but: you should export the results through the function's result variables. As stated in the parallel_run docs:

    Furthermore, no locking primitives are available in Scilab to handle concurrent accesses to shared variables. For this reason, the concurrent execution of Scilab macros cannot be safe in a shared memory environment and each parallel execution must be done in a separate memory space. As a result, one should not rely on side effects such as modifying variables from outer scope : only the data stored into the result variables will be copied back into the calling environment.

    Also check that all parallel calculations are completely independent, and pass all inputs in through parameters.

    Looking at your code, everything declared as global would be shared between all the processes, so move it all into the argument list and the result variables. You also shouldn't assume any order of execution of the separate calculations.
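    As a sketch of that idea applied to your loop body (an assumption about your surrounding variables, not a tested drop-in replacement): each call receives only its own column of ui and pop and its own entry of val, and returns its results instead of writing to globals. Note that fct and y must be visible to the worker processes, and non-scalar results may require declaring their sizes via parallel_run's optional result-dimensions argument (check the docs).

    function [newcol, newval]=gpar(uicol, oldcol, oldval)
      tempval = fct(uicol, y);    // check cost of competitor
      if (tempval <= oldval)      // competitor better than value in "cost array"
         newcol = uicol;          // return the new vector
         newval = tempval;
      else
         newcol = oldcol;         // keep the old vector
         newval = oldval;
      end;
    endfunction

    [pop, val] = parallel_run(ui, pop, val, "gpar");

    parallel_run slices the matrix arguments column by column, so each call works on one population member, and the returned columns are copied back into the calling environment.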

    The examples in the Scilab docs are actually pretty good at explaining the return values. For instance, the following example:

    function [r_min, r_med, r_max]=min_med_max(a, b, c)
      r_min=min(a,b,c); r_med=median([a,b,c]); r_max=max(a,b,c);
    endfunction
    
    N=10;
    A=rand(1:N);B=rand(1:N);C=rand(1:N);
    
    [Min,Med,Max]=parallel_run(A,B,C,"min_med_max");
    

    Besides requiring you to use the results, Scilab is very strict about how they are defined: only column vectors, etc.
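    Applied to your code, once parallel_run has copied the per-member results back into pop and val, the bookkeeping your original loop did through the globals optval, optarg and nfeval can be done with a serial reduction after the call (a sketch, assuming val holds the updated costs):

    [optval, k] = min(val);   // best cost over the whole population
    optarg = pop(:, k);       // corresponding best parameter vector
    nfeval = nfeval + NP;     // one function evaluation per member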

    Also check out the Scilab Wiki about parallel computing.

    Hope this helps you.