python pickle python-multiprocessing scipy-optimize

Scipy.Optimize - multiprocessing and resuming on last iteration in Python

scipy.optimize.differential_evolution can run in parallel when its "workers" parameter is set to a value > 1 (or to -1 to use all available cores). However, I also want to be able to resume the optimization from its last iteration, and combining that with the multiprocessing feature seems to be impossible.

I can resume the operation through the use of DifferentialEvolutionSolver (as per https://github.com/scipy/scipy/issues/6517), but having "workers" > 1 in DifferentialEvolutionSolver keeps me from pickling the object in order to maintain persistence between sessions.

import pickle
from scipy.optimize._differentialevolution import DifferentialEvolutionSolver

# 66 parameters, each bounded to [-1, 1]
bounds = [(-1, 1)] * 66

if __name__ == '__main__':
    # `converter` is my objective function (definition omitted here)
    solver = DifferentialEvolutionSolver(converter, bounds, disp=True, seed=9,
                                         workers=-1, maxiter=1)
    for i in range(100):
        best_x, best_cost = next(solver)
        print(solver.population_energies.min())
        with open('solver_%d.pkl' % i, 'wb') as f:
            pickle.dump(solver, f)

This code raises the following error as soon as it tries to pickle the solver after the first iteration:

NotImplementedError: pool objects cannot be passed between processes or pickled

However, if I use "workers=1", the code works fine, but it's obviously much slower.

Is there any way to get both multiprocessing and the ability to save each iteration along the way?

Solution

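Bear in mind that DifferentialEvolutionSolver is not part of SciPy's public API and is liable to change; that freedom is what allows it to be re-engineered or made faster. The public-facing function with backwards compatibility is differential_evolution. If you're prepared to cope with this, you can create the solver with workers=1, point its private _mapwrapper attribute at Pool.map while an iteration runs, and swap the builtin map back in before pickling, so that the unpicklable pool is never part of the saved object: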

import pickle
from multiprocessing import Pool

from scipy.optimize import rosen
from scipy.optimize._differentialevolution import DifferentialEvolutionSolver

bounds = [(-2, 2), (-2, 2)]

# the guard is needed on platforms that start worker processes with "spawn"
if __name__ == '__main__':
    with Pool(2) as p:
        # workers=1 keeps a Pool out of the solver itself; make sure that
        # `updating='deferred'`, as this is required for parallelisation
        solver = DifferentialEvolutionSolver(
            rosen,
            bounds,
            seed=9,
            workers=1,
            updating='deferred',
        )

        for i in range(100):
            # the _mapwrapper attribute needs to be a map-like
            solver._mapwrapper = p.map
            best_x, best_cost = next(solver)
            print(solver.population_energies.min())
            # swap the builtin map back in so that the solver can be pickled
            solver._mapwrapper = map
            with open(f"solver_{i}.pkl", 'wb') as f:
                pickle.dump(solver, f)
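
To resume in a later session, the same swap can be done in reverse: unpickle the solver, then attach a fresh Pool.map before calling next() again. The sketch below wraps both directions in two small helpers. The names save_checkpoint and load_checkpoint are hypothetical (they are not part of SciPy), and the code assumes only what the snippet above relies on, namely that the solver keeps its map-like in the private _mapwrapper attribute. Since that attribute and the solver class are private, a checkpoint should probably be loaded with the same SciPy version that wrote it.

```python
import pickle


def save_checkpoint(solver, path):
    """Pickle `solver` with the builtin map in place of its map-like."""
    active = solver._mapwrapper
    solver._mapwrapper = map  # Pool.map is bound to a Pool, which cannot be pickled
    try:
        with open(path, "wb") as f:
            pickle.dump(solver, f)
    finally:
        solver._mapwrapper = active  # keep using the pool after saving


def load_checkpoint(path, map_like=map):
    """Unpickle a solver and attach the map-like it should use from now on."""
    with open(path, "rb") as f:
        solver = pickle.load(f)
    solver._mapwrapper = map_like
    return solver


# Hypothetical usage in a later session (inside `if __name__ == '__main__':`,
# with `from multiprocessing import Pool`), continuing from the last file
# written by the loop above:
#
#     with Pool(2) as p:
#         solver = load_checkpoint("solver_99.pkl", p.map)
#         for i in range(100, 200):
#             best_x, best_cost = next(solver)
#             save_checkpoint(solver, f"solver_{i}.pkl")
```

Because save_checkpoint restores the previous map-like in a finally block, the loop does not have to reassign solver._mapwrapper on every iteration. The objective function must still be importable when the file is loaded, since pickle stores functions by reference.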