Tags: python, pickle, python-multiprocessing, scipy-optimize

Scipy.Optimize - multiprocessing and resuming on last iteration in Python


SciPy's differential_evolution (in scipy.optimize) can evaluate the population in parallel when its "workers" parameter is set to a value greater than 1 (or to -1 to use all available cores). However, I cannot find a way to resume an optimization from its last iteration while still using multiprocessing.

I can resume the optimization by driving DifferentialEvolutionSolver directly (as per https://github.com/scipy/scipy/issues/6517), but with "workers" > 1 the solver object can no longer be pickled, so I cannot persist it between sessions.

import pickle
from scipy.optimize._differentialevolution import DifferentialEvolutionSolver

# 66 parameters, each bounded to [-1, 1]
bounds = [(-1, 1)] * 66

if __name__ == '__main__':
    # `converter` is my objective function (definition not shown)
    solver = DifferentialEvolutionSolver(converter, bounds, disp=True, seed=9,
                                         workers=-1, maxiter=1)
    for i in range(100):
        best_x, best_cost = next(solver)
        print(solver.population_energies.min())
        # save the solver state after every iteration
        with open('solver_%d.pkl' % i, 'wb') as f:
            pickle.dump(solver, f)

This code raises the following error as soon as it tries to pickle the solver after the first iteration:

NotImplementedError: pool objects cannot be passed between processes or pickled

With "workers=1" the code works fine, but it is much slower.
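The error message comes from the pool object itself rather than from anything specific to the solver. A minimal sketch that reproduces it with the standard library only is shown below. It uses ThreadPool to keep the example cheap; ThreadPool subclasses multiprocessing's Pool and, as far as I can tell, inherits the same refusal to be pickled. The helper name pickle_error is made up for this illustration.

```python
import pickle
from multiprocessing.pool import ThreadPool


def pickle_error(obj):
    """Return the exception raised when pickling `obj`, or None on success."""
    try:
        pickle.dumps(obj)
    except Exception as exc:
        return exc
    return None


with ThreadPool(1) as pool:
    # the pool itself refuses to be pickled
    pool_err = pickle_error(pool)
    # a bound method such as pool.map drags the pool along with it
    bound_map_err = pickle_error(pool.map)

# the builtin map holds no pool reference and pickles without complaint
builtin_map_err = pickle_error(map)

print(type(pool_err).__name__)
print(type(bound_map_err).__name__)
print(builtin_map_err)
```

Any object that keeps a reference to a pool, or to a pool's bound map method, fails in the same way, which is consistent with the solver pickling only when "workers=1".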

Is there any way to get both multiprocessing and the ability to save each iteration along the way?


Solution

  • Bear in mind that DifferentialEvolutionSolver is not part of SciPy's public API and is liable to change between releases; that freedom is needed so the implementation can be re-engineered or made faster. The public function with backwards-compatibility guarantees is differential_evolution. If you're prepared to cope with this, you can use the following:

    import pickle
    from multiprocessing import Pool

    from scipy.optimize import rosen
    from scipy.optimize._differentialevolution import DifferentialEvolutionSolver

    bounds = [(-2, 2), (-2, 2)]

    # the guard is needed where worker processes are spawned (e.g. Windows)
    if __name__ == '__main__':
        with Pool(2) as p:
            # make sure that `updating='deferred'`, as this is required for
            # parallelisation
            solver = DifferentialEvolutionSolver(
                rosen,
                bounds,
                seed=9,
                workers=1,
                updating='deferred',
            )

            for i in range(100):
                # the _mapwrapper attribute needs to be a map-like;
                # use the pool's map for this iteration
                solver._mapwrapper = p.map
                best_x, best_cost = next(solver)
                print(solver.population_energies.min())
                # swap back to the builtin map so the solver holds no
                # reference to the pool and can be pickled
                solver._mapwrapper = map
                with open(f"solver_{i}.pkl", 'wb') as f:
                    pickle.dump(solver, f)