Tags: python, optimization, scipy, differential-evolution

Tracking the progress and convergence of optimization


I have an optimization problem with 36 variables.

I am using differential_evolution from scipy with popsize=15, recombination=0.1, mutation=(0.9, 1), strategy="best1bin", workers=1. I also print intermediate results using a callback. For example:

differential_evolution step 406: f(x)= 3182.8
[ 0.26630986  0.5777863   0.45482344  0.66859366  0.43145427  0.80049072  0.94042629  0.74227126  0.76609957  0.39421997  0.32598765  0.3198547   0.11332792  0.47842094
  0.41399388  0.16792322  0.24380912  0.74006486 -0.05026967  0.25951122  0.06112132  0.10108828  0.10494785 -0.38002972  0.1290126  -0.03676165  0.22189568  0.2424713
 -0.07450294 -0.52599295  0.1081112   0.41711925  0.12989419  0.63397135 -0.75419418 -0.38192472] convergence:  0.0

Is the population evaluated sequentially with workers=1, and how does that relate to the convergence value reported by the callback?

Specifically, how should I interpret these results? The solver is already at step 406, and convergence is still 0.0. Should I read that as no sign of convergence after 406 generations? Yet the callback has been reporting the same value of fun since step 321.


Solution

  • Is the population evaluated sequentially with workers=1?

    Yes, although if you use the vectorized option (see the documentation), you can evaluate the objective function for many members of the population in a single vectorized call, which may be faster than looping in Python.
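
    For illustration, here is a minimal sketch of the vectorized option (available since SciPy 1.9); the sphere function is a toy stand-in for your real objective:

    import numpy as np
    from scipy.optimize import differential_evolution

    # Toy objective standing in for the real one. With vectorized=True,
    # x arrives with shape (N, S): N variables, S trial solutions per call,
    # and we must return S objective values.
    def sphere(x):
        return np.sum(x**2, axis=0)

    bounds = [(-1, 1)] * 36
    result = differential_evolution(sphere, bounds, vectorized=True, popsize=15)
    print(result.fun, result.nfev)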

    and how does that relate to the convergence reported by the callback? Can I see it as no sign of convergence after 406 generations?

    It depends. Here is how convergence is calculated, and then we'll see how it can be 0.

    • The value of convergence that gets passed to the callback is the relative tolerance tol divided by the solver's internal measure of convergence, self.convergence (source).
    • The internal measure of convergence is the standard deviation of population energies divided by the absolute value of the mean (source).
    • The population energies are the objective function value for each member of the population (source).

    So if the function value is very spread out among members of the population, you can get very small convergence.

    However, to be exactly zero, one of three things must happen:

    • relative tolerance tol has to be 0,
    • the population standard deviation is so large relative to the mean that the convergence calculation underflows, or
    • one of your population energies is infinite (source), in which case the internal measure is infinite and tol divided by infinity is exactly 0 (see the sketch just below).
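
    To make that concrete, here is a hypothetical helper (it mirrors the calculation described above, but is not part of SciPy's API) showing how spread-out or infinite energies drive the reported number:

    import numpy as np

    _MACHEPS = np.finfo(np.float64).eps

    # Hypothetical mirror of the internal calculation: the callback receives
    # tol / (std(energies) / |mean(energies)|). tol=0.01 is the
    # differential_evolution default.
    def reported_convergence(population_energies, tol=0.01):
        energies = np.asarray(population_energies, dtype=float)
        if np.any(np.isinf(energies)):
            return 0.0  # internal measure is infinite, so tol / inf == 0
        spread = np.std(energies) / (np.abs(np.mean(energies)) + _MACHEPS)
        return tol / (spread + _MACHEPS)

    print(reported_convergence([100.0, 3182.8, 5000.0]))   # spread out: small, nonzero
    print(reported_convergence([3182.8, 3183.0, np.inf]))  # exactly 0.0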

    I would suggest taking a look at the individual energies (objective function values) to see which case applies (the last, if I had to guess). If your callback function has the signature

    callback(intermediate_result: OptimizeResult)
    

    in which the single argument is named intermediate_result, it receives an intermediate OptimizeResult object with a population_energies attribute, letting you inspect each member's energy (source).
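
    For example, a sketch of such a callback (this requires a SciPy version that supports the intermediate_result signature; the sphere objective is just for demonstration):

    import numpy as np
    from scipy.optimize import differential_evolution, OptimizeResult

    # The parameter must be named exactly `intermediate_result` for
    # differential_evolution to pass the OptimizeResult.
    def callback(intermediate_result: OptimizeResult):
        energies = intermediate_result.population_energies
        n_inf = np.count_nonzero(np.isinf(energies))
        print(f"best f(x) = {np.min(energies):.6g}, "
              f"infinite energies: {n_inf}/{energies.size}")

    def sphere(x):  # toy objective
        return np.sum(x**2)

    bounds = [(-1, 1)] * 4
    differential_evolution(sphere, bounds, callback=callback, maxiter=3)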


    Based on your comment, it sounds like at least one feasible solution has been found, but some members of the population still have not found a feasible solution after many iterations. Consequently, their associated energies are infinite, and convergence is exactly zero.

    If you're sure the constraints are correct and you can't loosen them to make them easier to satisfy, you have a few options.

    • Instead of using the constraints feature, evaluate the constraints yourself, and add a penalty term to the objective function that penalizes constraint violation (see the sketch after this list). This would let you see a nonzero convergence number even when some members of the population violate constraints. However, members that don't satisfy the constraints may still have very high energies relative to those that do, keeping convergence very low, so I wouldn't necessarily recommend this approach.
    • Use the information available in intermediate_result to define your own convergence criterion, and raise StopIteration when you're satisfied with the results (also shown in the sketch below).
    • After finding a few feasible solutions, restart optimization using the best feasible solution as a guess. Perhaps this will help other members of the population achieve feasibility. (OTOH I don't remember how the guess is used, so this might not be true.)
    • Just set maxiter to the number of iterations you're willing to wait, and let differential_evolution do its best within that budget. If it doesn't think it has converged in the allotted iterations, that doesn't necessarily mean the best solution it found isn't good enough for your purposes. It's all heuristics anyway: there's no guarantee that convergence=1 means the global optimum has been found, nor that convergence=0 means it hasn't. Ultimately, you are the judge of whether the solution is good enough.
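
    As a rough sketch of the first two options (the constraint function, penalty weight, and stopping threshold below are all placeholders to replace with your own):

    import numpy as np
    from scipy.optimize import differential_evolution, OptimizeResult

    def constraint_violation(x):
        # Placeholder: 0 when feasible, positive when constraints are violated.
        return max(0.0, np.sum(x) - 1.0)

    def penalized_objective(x):
        # Option 1: fold the violation into the objective so infeasible members
        # get large-but-finite energies instead of inf.
        penalty_weight = 1e3  # placeholder tuning knob
        return np.sum(x**2) + penalty_weight * constraint_violation(x)

    def callback(intermediate_result: OptimizeResult):
        # Option 2: custom stopping rule -- halt once the population energies
        # are tightly clustered (threshold is a placeholder).
        if np.ptp(intermediate_result.population_energies) < 1e-6:
            raise StopIteration

    bounds = [(-1, 1)] * 36
    result = differential_evolution(penalized_objective, bounds,
                                    callback=callback, maxiter=1000)
    print(result.fun, result.message)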